I Introduction
The fast Fourier transform (FFT) is a well-known success story in engineering. From a numerical point of view, this is quite surprising since, at first sight, the FFT provides a mere first-order approximation of the discrete-time Fourier transform one is actually interested in. Upon closer inspection, it however turns out that approximations based on FFTs perform much better once the underlying signal is smooth [17]. Recently, nonlinear Fourier transforms (NFTs) have been gaining much attention in engineering areas such as fiber-optic communications [3, 35] and coastal engineering [24, 12]. NFTs are generalizations of the conventional Fourier transform that allow one to solve specific nonlinear evolution equations in a way that is analogous to how Fourier solved the heat equation [1]. The evolution of the nonlinear Fourier spectrum is, exactly as in the linear case, much simpler than the evolution of the original signal. NFTs also have unique data analysis capabilities that enable the detection of particle-like signal components known as solitons [31]. Recently, a nonlinear variant of the FFT has been derived [41, 42]. These types of fast NFTs (FNFTs) can provide up to second-order accuracy [35]. Unfortunately, unlike for the FFT, the accuracy of the FNFTs in [41, 42, 35] remains (at most) second-order even if the underlying signal is smooth. As a result, engineers currently have to strongly oversample even smooth signals in order to get reliable numerical results [16, Sec. 4]. Several authors have proposed NFT algorithms with higher orders of accuracy, utilizing either Runge–Kutta [13, 45] or implicit Adams methods [36]. However, even though these methods have higher accuracy orders, they require very small sampling intervals in order to actually perform better than standard second-order methods such as [10]. For practically relevant sampling intervals, they are typically not the best choice, as they are slower and may even perform worse in these regimes.
Numerical methods that provide better complexity–accuracy tradeoffs in practically relevant regimes had been an open problem until recently. In [14], the authors introduced a new numerical method that can compute the NFT with accuracies that are orders of magnitude better than those of standard methods while having comparable run times. In this paper, the following three contributions are made.

Improved variants of the numerical NFT in [14] with even better complexity–accuracy tradeoffs are presented and investigated in detailed numerical examples.

Fast versions of the new numerical NFTs are derived.

A series acceleration technique is investigated to improve the complexity–accuracy tradeoff even further.
To give an illustration of the achieved gains, we point out that in one of our numerical examples, the best-performing of our new methods achieves an accuracy that is a hundred million times better than that of the standard second-order method in [10] at a comparable run time.¹

¹ Compare the error for CF in Fig. 6 with that of FCF_RE in Fig. 13 at the given execution time (in seconds). We remark that although the execution times are implementation-specific, they still give a good indication of the advantages of our proposed algorithms.
The paper is structured as follows. In Sec. II, we derive improved versions of our recently proposed numerical NFT [14] and compare them with both conventional second-order and other higher-order NFT algorithms in multiple numerical examples. Then, in Sec. III, we demonstrate how some of our new numerical NFTs can be made fast. The fast versions are compared with their slow counterparts. Next, in Sec. IV, we investigate how series acceleration techniques can improve the complexity–accuracy tradeoff of the fast NFT methods even further. Sec. V deals with a special part of the NFT that occurs only in some cases. The paper is finally concluded in Sec. VI.
Notation
Real numbers: ; ; Complex numbers: ; Complex numbers with positive imaginary part: ; Integers: ; ; Euler’s number: e; Real part: ; Complex conjugate: ; Floor function: ; Absolute value: ; Matrix exponential: expm; Matrix product: ; Matrix element in the th column and th row: ; Fourier transform of function , : ; Inverse Fourier transform of function , : .
I-A Nonlinear Fourier Transform (NFT)
In this section we describe the mathematical machinery behind the NFT. Let $q(x,t)$ denote the complex envelope of the electric field in an ideal optical fiber, whose evolution in normalized coordinates is described by the nonlinear Schrödinger equation (NSE) [2, Chap. 2]
(1)
$$\mathrm{i} q_x(x,t) = q_{tt}(x,t) - 2\kappa |q(x,t)|^2 q(x,t), \qquad \kappa \in \{+1, -1\}.$$
Here, $x$ denotes the location in the fiber and $t$ denotes retarded time. The parameter $\kappa$ determines if the dispersion in the fiber is normal ($\kappa = -1$) or anomalous ($\kappa = +1$). For $\kappa = +1$, Eq. 1 is referred to as the focusing NSE and for $\kappa = -1$ as the defocusing NSE. The NFT that solves the NSE (Eq. 1) is due to Zakharov and Shabat [46]. It transforms any signal $q(t)$ that vanishes sufficiently fast for $t \to \pm\infty$ from the time domain to the nonlinear Fourier domain through the analysis of the linear ordinary differential equation (ODE)
(2)
$$\frac{\partial V}{\partial t}(\lambda, t) = \underbrace{\begin{bmatrix} -\mathrm{i}\lambda & q(t) \\ -\kappa \bar q(t) & \mathrm{i}\lambda \end{bmatrix}}_{=:P(\lambda, t)} V(\lambda, t),$$
where $V(\lambda, t) \in \mathbb{C}^{2\times 2}$, subject to the boundary conditions
(3)
$$V(\lambda, t) \to \begin{bmatrix} e^{-\mathrm{i}\lambda t} & 0 \\ 0 & e^{\mathrm{i}\lambda t} \end{bmatrix} \quad \text{as } t \to -\infty.$$
The term $\lambda \in \mathbb{C}$ is a spectral parameter similar to $s$ in the Laplace domain. The matrix $V(\lambda, t)$ is said to contain the eigenfunctions since Eq. 2 can be rearranged into an eigenvalue equation with respect to $\lambda$ [1]. One can view the eigenfunctions as being scattered by the potential $q(t)$ as they move from $t = -\infty$ to $t = +\infty$. Hence, Eq. 2 is referred to as the scattering problem [46]. (Many problems in signal processing can be expressed through such scattering problems [11].) For Eq. 2 subject to the boundary conditions in Eq. 3, there exists a unique matrix called the scattering matrix,
(4)
$$S(\lambda) = \begin{bmatrix} a(\lambda) & \bar b(\lambda) \\ b(\lambda) & \bar a(\lambda) \end{bmatrix},$$
such that [1]
(5)
$$V(\lambda, t) \to \begin{bmatrix} e^{-\mathrm{i}\lambda t} & 0 \\ 0 & e^{\mathrm{i}\lambda t} \end{bmatrix} S(\lambda) \quad \text{as } t \to +\infty.$$
The components $a(\lambda)$, $\bar a(\lambda)$, $b(\lambda)$ and $\bar b(\lambda)$ are known as the scattering data. The components $a(\lambda)$ and $b(\lambda)$ are sufficient to describe the signal completely. Their evolution along the $x$ dimension (along the length of the fiber) is simple [1, Sec. III]:
(6)
$$a(\lambda; x) = a(\lambda; 0), \qquad b(\lambda; x) = e^{4\mathrm{i}\lambda^2 x}\, b(\lambda; 0).$$
The reflection coefficient is then defined as $\rho(\lambda) := b(\lambda)/a(\lambda)$ for $\lambda \in \mathbb{R}$. The nonlinear Fourier spectrum can also contain a so-called discrete spectrum in the case of $\kappa = +1$. It corresponds to the complex poles of the reflection coefficient in the upper half-plane, which are given by the zeros $\lambda_k$ of $a(\lambda)$. It is known that there are only finitely many such poles. The poles are also referred to as eigenvalues, and a corresponding set of values $b_k := b(\lambda_k)$ are known as norming constants. There are different ways to define a nonlinear Fourier spectrum. One possibility is $\{\rho(\lambda),\ (\lambda_k, b_k)\}$ [1]. The other is $\{\rho(\lambda),\ (\lambda_k, b_k / \tfrac{da}{d\lambda}(\lambda_k))\}$ [18]. In this paper we are primarily interested in the computation of $\rho(\lambda)$, but some notes regarding the computation of the $\lambda_k$ and the $b_k$ will also be given. Although we will illustrate our algorithms by applying them to the specific case of the NFT of the NSE with vanishing boundary conditions, it should be noted that we are primarily presenting algorithms for solving equations similar to Eq. 2 [1, Eq. 2]. Hence the algorithms presented in this paper can be extended to compute NFTs with respect to other nonlinear evolution equations, such as the Korteweg–de Vries equation [26].
II Numerical Computation of the NFT Using Higher-Order Exponential Integrators
In this section we start by outlining some assumptions required for the numerical methods that will be presented. We give a brief overview of one of the approaches for computing the NFT and then talk specifically about our proposal to use a special class of so-called exponential integrators. To evaluate the methods, we describe numerical examples and performance criteria. We finally show and compare the results of the various methods applied to these examples.
II-A Assumptions
The numerical computation of the NFT is carried out with discrete data samples on finite precision machines. Hence, we need to make the following assumptions:

The support of the potential $q(t)$ is truncated to a finite interval $[T_-, T_+]$ instead of $(-\infty, \infty)$. The values $T_\pm$ are chosen such that the resulting truncation error is sufficiently small. The approximation is exact if $q(t)$ vanishes outside $[T_-, T_+]$.

The interval $[T_-, T_+]$ is divided into $D$ subintervals of width $h = (T_+ - T_-)/D$. We assume that the signal is sampled at the midpoint of each subinterval, such that $q_n = q\big(T_- + (n - \tfrac{1}{2})h\big)$ for $n = 1, \dots, D$.
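In code, this midpoint sampling grid can be set up as follows (an illustrative NumPy sketch; the function name is ours, not from the paper, whose implementation is in MATLAB):

```python
import numpy as np

def midpoint_grid(T_minus, T_plus, D):
    """Return the midpoints of the D equal subintervals of [T_minus, T_plus]
    together with the subinterval width h."""
    h = (T_plus - T_minus) / D
    n = np.arange(1, D + 1)
    # Midpoint of the n-th subinterval: T_minus + (n - 1/2) * h
    return T_minus + (n - 0.5) * h, h
```

The given samples $q_n$ are then the signal values at these grid points.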
II-B Numerical Scattering
The main step in numerically computing the NFT is to solve the scattering problem in Eq. 2 for $V(\lambda, T_+)$ for different values of $\lambda$. We can view the $D$ subintervals as layers which scatter the eigenfunctions as they move from $t = T_-$ to $t = T_+$. Using numerical ODE solvers, we solve for an approximation $\hat V(\lambda, T_+)$ of $V(\lambda, T_+)$. By taking $V(\lambda, T_-)$ equal to the limit in Eq. 3 evaluated at $t = T_-$, we can compute a numerical approximation of the scattering data and ultimately of the reflection coefficient.
II-C Exponential Integrators
Almost any numerical method available in the literature for solving ODEs can be used to solve for $\hat V(\lambda, T_+)$ [13, 37]. However, we are particularly interested in so-called exponential integrators. These methods have been shown to provide a very good tradeoff between accuracy and computational cost in several numerical benchmark problems while being fairly easy to implement; see [7] and the references therein. We propose to use a special subclass known as commutator-free quasi-Magnus (CFQM) exponential integrators, as some NFT algorithms based on these integrators turn out to have the special structure [41] that is needed to make them fast. We show this in Sec. III.
For the rest of the paper we adopt the notation used in [8]. The results in [8] provide the following iterative scheme to compute a numerical approximation $\hat V(\lambda, t)$ of $V(\lambda, t)$:
(7)
$$\hat V(\lambda, t_n + h) = \underbrace{\operatorname{expm}\big(B_m^{(n)}\big) \cdots \operatorname{expm}\big(B_1^{(n)}\big)}_{=: G_n(\lambda)}\ \hat V(\lambda, t_n),$$
where
(8)
$$B_i^{(n)} = h \sum_{j=1}^{J} a_{ij}\, P\big(\lambda, t_n + c_j h\big), \qquad i = 1, \dots, m,$$
and $P(\lambda, t)$ denotes the coefficient matrix of the ODE in Eq. 2. By iterating from $n = 1$ to $D$, we obtain the numerical approximation $\hat V(\lambda, T_+)$ that we need to compute the NFT. Denoting the one-step matrices in short by $G_n(\lambda)$, we can write
(9)
$$\hat V(\lambda, T_+) = G_D(\lambda)\, G_{D-1}(\lambda) \cdots G_1(\lambda)\, V(\lambda, T_-).$$
For an integrator $\mathrm{CF}^{[p]}_m$, $p$ is the order and $m$ is the number of matrix exponentials required for each subinterval. An integrator of order $p$ has a local error (error in each subinterval) of $\mathcal{O}(h^{p+1})$. Over the $D = (T_+ - T_-)/h$ such subintervals, i.e., over the interval $[T_-, T_+]$, the global error will be $\mathcal{O}(h^{p})$. This distinction between local and global error will become important when we define the error metric used to compare the various algorithms in Sec. II-D. $J$ is the number of points within each subinterval where the potential value needs to be known. The coefficients $a_{ij}$ and $c_j$ are unique for each combination of $p$ and $m$. We refer the reader to [8] for their derivation.
The integrator $\mathrm{CF}^{[2]}_1$ is also sometimes referred to as the exponential midpoint rule. It was used in the context of the NFT for the defocusing NSE ($\kappa = -1$) by Yamada and Sakuda [43] and later by Boffetta and Osborne [10]. For $\mathrm{CF}^{[2]}_1$, Eq. 7 reduces to
(10)
$$\hat V(\lambda, t_n + h) = \operatorname{expm}\!\left( h \begin{bmatrix} -\mathrm{i}\lambda & q_n \\ -\kappa \bar q_n & \mathrm{i}\lambda \end{bmatrix} \right) \hat V(\lambda, t_n).$$
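A single layer of this exponential midpoint rule can be sketched in a few lines. The following Python snippet uses the closed-form exponential of a traceless 2×2 matrix rather than a general-purpose `expm`; the matrix form and signs follow the standard Zakharov–Shabat convention and may differ from the paper's normalization, and the function name is ours:

```python
import numpy as np

def zs_midpoint_step(q_mid, lam, h, kappa=1):
    """One layer of the exponential midpoint rule: multiply the running
    solution by expm(h * P), with P = [[-1j*lam, q], [-kappa*conj(q), 1j*lam]]
    evaluated at the midpoint sample q_mid. For a traceless 2x2 matrix P with
    P @ P = delta**2 * I, expm(h*P) = cosh(h*delta)*I + sinh(h*delta)/delta * P."""
    P = np.array([[-1j * lam, q_mid],
                  [-kappa * np.conj(q_mid), 1j * lam]], dtype=complex)
    delta = np.sqrt(complex(-lam**2 - kappa * abs(q_mid)**2))
    if abs(delta) < 1e-12:
        # Series fallback for delta -> 0: expm(h*P) ~ I + h*P
        return np.eye(2, dtype=complex) + h * P
    return np.cosh(h * delta) * np.eye(2) + (np.sinh(h * delta) / delta) * P
```

Since the system matrix is traceless, the determinant of each layer matrix equals one, which is a useful sanity check for an implementation.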
This is applied repeatedly as in Eq. 9 to obtain $\hat V(\lambda, T_+)$. In [14] we investigated the possibility of using $\mathrm{CF}^{[4]}_2$ (first introduced in [34]) to obtain $\hat V(\lambda, T_+)$. We were able to show its advantage over $\mathrm{CF}^{[2]}_1$ when considering the tradeoff between error and execution time. Here we investigate further in this direction and evaluate $\mathrm{CF}^{[4]}_3$, $\mathrm{CF}^{[5]}_3$ and $\mathrm{CF}^{[6]}_4$, which are fourth-, fifth- and sixth-order methods, respectively. The CFQM exponential integrators require samples at multiple non-equispaced points within each subinterval. However, it is unrealistic to assume that signal samples at such non-equispaced points can be obtained in a practical setting. Hence, in [14] we proposed the use of local cubic-spline based interpolation to obtain the non-equispaced samples from the midpoints of each subinterval. We will refer to the samples at these midpoints as the given samples. Although the use of a local cubic-spline based interpolation was found to be sufficiently accurate for $\mathrm{CF}^{[4]}_2$, it is insufficient for higher-order methods. Here, we propose to combine band-limited interpolation with the time-shift property of the Fourier transform, i.e.,
(11)
$$q(t - \epsilon) = \mathcal{F}^{-1}\big\{ e^{-\mathrm{i}\omega\epsilon}\, \mathcal{F}\{q(t)\}(\omega) \big\}(t),$$
to obtain the irregularly spaced samples. This interpolation rule is accurate when $q(t)$ is sampled in accordance with the Nyquist criterion. As we are working with discrete signal samples, the interpolation can be implemented efficiently using the fast Fourier transform (FFT).
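The FFT-based band-limited interpolation described above can be sketched as follows. The shift is given in units of the sampling interval, the sign convention (positive shift evaluates the signal later in time) is ours, and the function name is an illustration rather than the paper's implementation:

```python
import numpy as np

def fft_timeshift(samples, shift):
    """Band-limited interpolation of uniformly spaced samples: evaluates the
    underlying band-limited signal at n + shift (shift in samples) by
    multiplying the spectrum with the time-shift phase factor."""
    N = len(samples)
    f = np.fft.fftfreq(N)                 # normalized frequencies (cycles/sample)
    spectrum = np.fft.fft(samples)
    shifted = spectrum * np.exp(2j * np.pi * f * shift)
    return np.fft.ifft(shifted)
```

Applying this with the offsets of the non-equispaced CFQM nodes yields all required intermediate samples with two FFTs per offset.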
II-D Error Metric and Numerical Examples
In this section, we compare the performance of the CFQM exponential integrators $\mathrm{CF}^{[2]}_1$, $\mathrm{CF}^{[4]}_2$, $\mathrm{CF}^{[4]}_3$, $\mathrm{CF}^{[5]}_3$ and $\mathrm{CF}^{[6]}_4$, the two-step Implicit-Adams method (IA) introduced in [36], and the fourth-order Runge–Kutta method (RK) [13] for the computation of the reflection coefficient. The fourth-order Runge–Kutta method was the first fourth-order method used for the computation of the reflection coefficient in [13, 45]. We include the third-order two-step Implicit-Adams method here as it was the first higher-order method introduced in the context of the fast nonlinear Fourier transform. The meaning of "fast" will be made precise in Sec. III.
We are interested in evaluating the tradeoff between the increased accuracy and the execution time due to the use of higher-order methods. We assess the accuracy of the different methods using the relative error
(12)
$$E = \sqrt{ \frac{\sum_{i=1}^{M} \big| \rho(\lambda_i) - \hat\rho(\lambda_i) \big|^2 }{ \sum_{i=1}^{M} \big| \rho(\lambda_i) \big|^2 } },$$
where $\rho$ is the analytical reflection coefficient, $\hat\rho$ is the numerically computed reflection coefficient, and the $\lambda_i$ are $M$ equally spaced points in the considered spectral interval. $E$ is a global error and hence it is expected to be $\mathcal{O}(h^p)$ for an integrator of order $p$, as explained in Sec. II-C. We fixed $M$ to ensure acceptable execution times for the numerical tests.
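Assuming the error metric is the usual relative $\ell^2$ error over the spectral grid, it can be computed as in this NumPy sketch (function name ours):

```python
import numpy as np

def relative_error(rho_exact, rho_num):
    """Relative l2-type error between the exact and the numerically computed
    reflection coefficients evaluated on the same grid of spectral points."""
    rho_exact = np.asarray(rho_exact, dtype=complex)
    rho_num = np.asarray(rho_num, dtype=complex)
    return float(np.sqrt(np.sum(np.abs(rho_num - rho_exact)**2)
                         / np.sum(np.abs(rho_exact)**2)))
```

The normalization by the energy of the exact coefficient makes errors comparable across examples with different signal strengths.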
II-D1 Example 1: Hyperbolic Secant ($\kappa = +1$)
As the first numerical example we chose a frequency-shifted hyperbolic secant signal. The closed form of the reflection coefficient is given by applying the frequency-shift property [44, Sec. D] to the analytically known reflection coefficient of the secant-hyperbolic signal [29],
(13)  
where is the gamma function. The discrete spectrum is
(14)  
(15)  
(16) 
We set , , , and chose to ensure negligible truncation error.
II-D2 Example 2: Rational Reflection Coefficient With One Pole
The signal is given by [28]
(17) 
where , are scalar parameters and . We used and . The corresponding reflection coefficient is then known to be
(18) 
As the signal in Eq. 17 has a discontinuity, it cannot be interpolated well using band-limited interpolation. We hence assume, only in this example, that we can sample the signal at exactly the points that we require.
II-D3 Example 3: Hyperbolic Secant
The signal is given by
(19) 
where , and are scalar parameters. We used , and . The corresponding reflection coefficient is known to be [25]
(20) 
where is the gamma function, , , and . We set and chose .
The numerical methods were implemented in MATLAB using the exact representation of the matrix exponential in [6] for the CFQM exponential integrators. The execution times for each method were recorded using the MATLAB stopwatch function (tic–toc). The execution times are of course implementation-specific, but we tried to make the implementations as comparable as possible. Fig. 2 shows the error measure $E$, as defined in Eq. 12, for Example 1 for a range of relatively large sampling intervals $h$. These results suggest that the higher-order methods can always be preferred over the lower-order methods. The error measures for smaller sampling intervals for Examples 1, 2 and 3 are shown in Figs. 2, 4 and 4, respectively. To ensure that the discontinuity in Example 2 is faithfully captured, we use a slightly different time grid for the Runge–Kutta and Implicit-Adams methods compared to the description in Sec. II-A. For all three examples, the slopes of the error lines are in agreement with the order of each method, except for IA. For smooth signals, IA is seen to have an error of order four rather than the expected three. This observation is in agreement with [36, Fig. 2]. However, for the discontinuous signal of Example 2 we see third-order behavior as expected. We can also see that a higher order $p$ generally corresponds to better accuracy (lower $E$) for the same $h$. However, this is not guaranteed, as seen in Fig. 2, where the nominally lower-order method is more accurate for larger $h$. The advantage of using three exponentials ($m = 3$) in $\mathrm{CF}^{[4]}_3$ instead of two in $\mathrm{CF}^{[4]}_2$ is also clear from the figures. The third-order Implicit-Adams method (IA) and the fourth-order Runge–Kutta method (RK) may be more accurate than $\mathrm{CF}^{[4]}_2$ depending on the signal and other parameters, but have lower accuracy compared to the higher-order CFQM integrators. The error reaches a minimum and can start rising again, as seen in Fig. 4. To understand this behavior, note that the error per layer in Eq. 7 is actually $\mathcal{O}(h^{p+1}) + \epsilon$, where $\epsilon$ is a small constant due to finite precision effects that can normally be neglected. The total error is thus $\mathcal{O}(h^{p}) + \epsilon/h$. As $h$ becomes smaller and smaller, the second component, also known as the arithmetic error, becomes dominant and eventually causes the total error to rise again [22].
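The competition between the discretization and arithmetic error components can be made concrete with a small model. The constants below are assumed purely for illustration (a unit discretization constant and a machine-epsilon-sized finite-precision constant), not taken from the paper:

```python
def total_error(h, p, c1=1.0, c2=1e-16):
    """Model of the total error: discretization term c1*h**p plus the
    arithmetic (finite-precision) term c2/h that dominates for small h."""
    return c1 * h**p + c2 / h

def optimal_h(p, c1=1.0, c2=1e-16):
    """Minimizer of the model: setting d/dh (c1*h**p + c2/h) = 0 gives
    p*c1*h**(p-1) = c2/h**2, i.e. h* = (c2 / (p*c1))**(1/(p+1))."""
    return (c2 / (p * c1)) ** (1.0 / (p + 1))
```

The model predicts that higher-order methods hit their error floor at larger step sizes, consistent with the earlier onset of the error rise observed for the higher-order integrators.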
For the CFQM exponential integrators, the computation of the transfer matrix in Eq. 9 for each $\lambda$ requires $\mathcal{O}(D)$ multiplications of $2 \times 2$ matrices (Eq. 7) for $D$ given samples. If the reflection coefficient is to be computed at $M$ points, then the overall computational complexity will be of order $\mathcal{O}(MD)$. In Fig. 5 we plot the execution times of all the methods for Example 1. These execution times are representative for all examples. We can see that the execution time scales linearly with $D$, as expected, since $M$ is kept constant in the numerical examples. The execution time of the CFQM exponential integrators is also a linear function of the number of exponentials $m$. The IA, RK and $\mathrm{CF}^{[2]}_1$ methods have similar execution times. To evaluate the tradeoff between execution time and accuracy, we plot the execution time on the x-axis and the error on the y-axis in Fig. 6 for Example 1. To read such tradeoff plots, we look at the error achieved by each method for a given amount of time. For Example 1 it turns out that one of the new higher-order methods provides the best tradeoff, and we can conclude that the extra computational cost of the higher-order methods is generally justified by the increased accuracy.
If $M$ is of the same order as $D$, then the computational complexity of the NFT will be $\mathcal{O}(D^2)$. Although performing $\mathcal{O}(D)$ multiplications of $2 \times 2$ matrices is fast, the total cost of the NFT is significantly higher when compared to its linear analogue, the FFT, which has a complexity of only $\mathcal{O}(D \log D)$. So the natural question to ask is: can the complexity be reduced?
III Fast Nonlinear Fourier Transform
The complexity of the new NFTs proposed in the previous section is quadratic in the number of given samples $D$. In this section, we demonstrate how the complexity of some of them can be reduced to $\mathcal{O}(D \log^2 D)$ by integrating them into the framework in [41, 42].
III-A Fast Scattering Framework
In the framework proposed in [41], each matrix factor in Eq. 9 is approximated by a rational function matrix in a transformed coordinate $z = z(\lambda)$. By substituting these approximations in Eq. 9, a rational function approximation of the transfer matrix is obtained:
(21)
$$\hat H(z) = \frac{1}{d(z)}\, N(z),$$
where $N(z)$ is a polynomial matrix and $d(z)$ is a scalar denominator polynomial.
We want to compute the coefficients of the numerator and denominator polynomials, respectively. A straightforward implementation of the matrix multiplication, where each entry is a polynomial, has a complexity of $\mathcal{O}(D^2)$. Instead, by using a binary-tree structure and FFTs [41, Alg. 1], the computational complexity can be reduced to $\mathcal{O}(D \log^2 D)$. Hence it is referred to as fast scattering. The resulting rational function approximation is explicitly parametrized in $z$, and hence Eq. 9 is reduced to polynomial evaluations for each $\lambda$. To elaborate, we again restrict ourselves to integrators of the form of Eq. 7. Hence, for $\mathrm{CF}^{[2]}_1$, we need to approximate the matrix exponential in Eq. 10. The matrix exponential can be approximated to various orders of accuracy using rationals [9] or using splitting schemes such as the well-known Strang splitting. Higher-order splitting schemes were developed in [26] that map $\lambda$, the domain of the reflection coefficient, to points $z$ on the unit circle. Such mappings have a certain advantage when it comes to polynomial evaluations, which we cover in Sec. III-B. For a higher-order integrator, each factor in Eq. 9 is a product of several matrix exponentials. For example, let us look at $\mathrm{CF}^{[4]}_2$. We can write
(22)
$$G_n(\lambda) = \operatorname{expm}\big(B^{(n)}_2\big)\, \operatorname{expm}\big(B^{(n)}_1\big).$$
Each of the two matrix exponentials can be approximated individually using a splitting scheme from [26]. The rational function approximation can then be obtained as in Eq. 21. However, there are a few caveats which prevent the extension of this idea to higher-order methods. The splitting schemes in [26] should not be applied to CFQM exponential integrators with complex coefficients $a_{ij}$. Complex coefficients mean that $\lambda$ is no longer mapped to points on the unit circle. Such a mapping is undesirable for polynomial evaluation, as will be explained in Sec. III-B. In addition, we do not even obtain a polynomial structure if there exists no $z$ such that each matrix exponential is a rational function in an integer power of this $z$. Furthermore, if such a $z$ exists but only for high coprime integer powers, the approximation will consist of sparse polynomials of high degree, which can significantly hamper the computational advantage of using it. For these reasons, we restrict ourselves to fast implementations of $\mathrm{CF}^{[2]}_1$ and $\mathrm{CF}^{[4]}_2$, which will be referred to as $\mathrm{FCF}^{[2]}_1$ and $\mathrm{FCF}^{[4]}_2$. The $\mathrm{FCF}^{[2]}_1$ algorithm is already available as part of the FNFT software library [40]. However, its accuracy and execution times have not been formally published anywhere in the literature. The $\mathrm{FCF}^{[4]}_2$ algorithm is completely new. For both $\mathrm{FCF}^{[2]}_1$ and $\mathrm{FCF}^{[4]}_2$ we use the fourth-order accurate splitting [26, Eq. 20].
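The binary-tree multiplication at the heart of fast scattering can be sketched for scalar polynomials as follows. This is a simplification: [41, Alg. 1] operates on 2×2 polynomial matrices, and all names here are ours:

```python
import numpy as np

def poly_mul_fft(a, b):
    """Multiply two polynomials (coefficient arrays) via FFT-based convolution."""
    n = len(a) + len(b) - 1
    nfft = 1 << max(n - 1, 0).bit_length()   # next power of two >= n
    fa = np.fft.fft(a, nfft)
    fb = np.fft.fft(b, nfft)
    return np.fft.ifft(fa * fb)[:n]

def poly_prod_tree(polys):
    """Multiply a list of polynomials pairwise in a binary tree, so that the
    expensive multiplications always involve factors of similar degree."""
    polys = [np.asarray(p, dtype=complex) for p in polys]
    while len(polys) > 1:
        paired = [poly_mul_fft(polys[i], polys[i + 1])
                  for i in range(0, len(polys) - 1, 2)]
        if len(polys) % 2:                   # carry an unpaired factor over
            paired.append(polys[-1])
        polys = paired
    return polys[0]
```

Pairing factors of similar degree is what turns the naive $\mathcal{O}(D^2)$ product of $D$ small polynomials into an $\mathcal{O}(D \log^2 D)$ computation.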
III-B Fast Evaluation
Once we obtain the rational function approximation of the transfer matrix in terms of numerator and denominator coefficients, we only have to evaluate the numerator and denominator polynomials for each value of $\lambda$ in order to compute the reflection coefficient. The degree of the polynomials to be evaluated will be at least $D$, which can be large in practice. It is known that the evaluation of such high-degree polynomials for large arguments can be numerically problematic [42, Sec. IV.E]. However, by choosing a mapping that takes the real line to the unit circle, the polynomials only need to be evaluated on the unit circle, where the evaluation of even high-degree polynomials is numerically easier. The higher-order splitting schemes in [26] were developed with such a mapping in mind, allowing for approximations of the matrix exponentials as rational functions in $z$. Evaluating a polynomial of degree $N$ using Horner's method has a complexity of $\mathcal{O}(N)$. Hence, for $M$ values of $\lambda$, the total cost of fast scattering followed by polynomial evaluation will be $\mathcal{O}(D \log^2 D + MD)$. Mapping $\lambda$ to points $z$ on the unit circle has an additional computational advantage. Let $p(z)$ be a polynomial in $z$ of degree $N$ with coefficients $c_n$. Evaluation of $p$ at a point $z_i$ can be written as
(23)
$$p(z_i) = \sum_{n=0}^{N} c_n z_i^{\,n}.$$
For points $z_i$ on the unit circle, this is equivalent to taking the chirp Z-transform (CZT) of the polynomial coefficients along an appropriately defined contour. The CZT can be computed efficiently using the algorithm in [27], which utilizes FFTs. We can also see Eq. 23 as a non-equispaced discrete Fourier transform (NDFT) of the polynomial coefficients, which allows us to utilize the efficient NFFT algorithms in [20] for evaluating the polynomial. If the number of evaluation points is of the same order of magnitude as $D$, the complexity of the evaluation becomes $\mathcal{O}(D \log D)$ and hence the overall complexity of the fast nonlinear Fourier transform (FNFT) is $\mathcal{O}(D \log^2 D)$.
III-C Numerical Examples
We implemented $\mathrm{FCF}^{[2]}_1$ and $\mathrm{FCF}^{[4]}_2$ using the fourth-order accurate splitting scheme [26, Eq. 20] and the fast polynomial multiplication algorithm [41, Alg. 1]. We used the NDFT routine from [20] for evaluating the polynomials. Now we compare the implementations of $\mathrm{CF}^{[2]}_1$ and $\mathrm{CF}^{[4]}_2$ presented in Sec. II-D with their fast versions $\mathrm{FCF}^{[2]}_1$ and $\mathrm{FCF}^{[4]}_2$. We plot the error versus the execution time for Example 1 in Fig. 7, for Example 2 in Fig. 8 and for Example 3 in Fig. 9. In the three figures we can see that the fast FCF algorithms achieve errors similar to those of their slow CF counterparts in a significantly shorter time. From the other viewpoint, for the same execution time, the FCF algorithms achieve significantly lower errors than the CF algorithms. The tradeoff shifts further towards the FCF algorithms as the number of evaluation points increases. $\mathrm{FCF}^{[4]}_2$ outperforms $\mathrm{FCF}^{[2]}_1$ in the tradeoff, which again highlights the advantage of using higher-order CFQM exponential integrators.
The error reaches a minimum and then starts rising. This is again attributed to the arithmetic error becoming the dominant source of error. We remark that in numerical tests the CZT was found to perform as well as the NDFT before the error minimum, but the error rise thereafter was significantly steeper.
Since the NFT is a nonlinear transform, it changes its form under scaling, and computing it typically becomes increasingly difficult when the signal is scaled up [37]. Hence it is of interest to study how the error scales with increasing signal energy. To test the scaling we use Example 1 and sweep the signal amplitude over a range of values while keeping all other parameters the same as before. As the time window remains the same, scaling the signal amplitude leads to a directly proportional scaling of the signal energy. We compute the error for decreasing $h$ for each value of the amplitude for the CF and FCF algorithms. We plot the error versus the sampling interval $h$ on a log scale for the CF algorithms in Fig. 10 and for the FCF algorithms in Fig. 11. Instead of plotting individual lines for each value of the amplitude, we represent the amplitude using different shades of gray. As shown in the colorbar, lighter shades represent lower and darker shades higher amplitudes. The stripes with a higher slope are the higher-order methods. The piecewise constant signal approximation becomes less accurate with increasing signal amplitude, which directly translates into a higher error for larger amplitudes. All four algorithms, i.e., $\mathrm{CF}^{[2]}_1$, $\mathrm{CF}^{[4]}_2$, $\mathrm{FCF}^{[2]}_1$ and $\mathrm{FCF}^{[4]}_2$, show similar trends for the scaling of the error with the signal energy. The $\mathrm{CF}^{[2]}_1$ algorithm was compared with other methods in [37] (where it is referred to as BO), with the conclusion that it scales best with increasing signal amplitude. Hence the results shown in Fig. 10 are very encouraging, as $\mathrm{CF}^{[4]}_2$ is seen to scale almost as well as $\mathrm{CF}^{[2]}_1$. The error of the approximations used in the FCF algorithms also depends on the amplitude. However, comparing Fig. 10 and Fig. 11, we can see that the contribution of the approximation error is negligible. These results, combined with the results in the tradeoff plots (Figs. 7, 8 and 9), make a strong case for the $\mathrm{FCF}^{[4]}_2$ algorithm.
As an efficient FNFT algorithm can be combined with the recently proposed $b$-modulation scheme [39, 19, 21] to develop a complete NFT-based fiber-optic communication system, the accurate and fast computation of the scattering coefficient $b(\lambda)$ is of interest. Hence, to test the performance of both FCF algorithms in the computation of the $b$ coefficient, we define
(24)
$$E_b = \sqrt{ \frac{ \sum_{i=1}^{M} \big| b(\lambda_i) - \hat b(\lambda_i) \big|^2 }{ \sum_{i=1}^{M} \big| b(\lambda_i) \big|^2 } },$$
where $b$ is the analytically known and $\hat b$ is the numerically computed scattering coefficient. For the numerical test we again use the signal from Example 1, as $b(\lambda)$ is known. We plot the error for both FCF methods for decreasing sampling intervals in Fig. 12. $\mathrm{FCF}^{[4]}_2$ clearly outperforms $\mathrm{FCF}^{[2]}_1$ even after considering the additional computational cost.
From the results of the numerical tests presented in this section, it is clear that approximating the matrix exponentials in Eq. 9 using rational functions provides a significant computational advantage. However, as mentioned earlier, we had to restrict ourselves to the fourth-order method $\mathrm{FCF}^{[4]}_2$. To further improve the accuracy and the order of convergence while keeping a fourth-order method, we explore the possibility of using series acceleration techniques in the next section.
IV Series Acceleration
Series acceleration methods are techniques for improving the rate of convergence of a series. In particular, they were used to improve an inverse NFT algorithm for the defocusing case in [15]. Series acceleration was also used to improve the accuracy of the split-step Fourier algorithm [32], which is commonly used to solve the NSE directly. Here we study the applicability of a specific series acceleration technique known as Richardson extrapolation [30] to the FCF algorithms.
IV-A Richardson Extrapolation
Given a $p$-th order numerical approximation $A(h)$ of a quantity $A$ that depends on the step size $h$, we can write
(25)
$$A(h) = A + \mathcal{O}(h^{p}).$$
We assume that $A(h)$ has a series expansion in $h$,
(26)
$$A(h) = A + c_{p} h^{p} + c_{p+1} h^{p+1} + \dots$$
In Richardson extrapolation [30], we combine the two numerical approximations $A(h)$ and $A(h/2)$ as follows,
(27)
$$\tilde A(h) = \frac{2^{p} A(h/2) - A(h)}{2^{p} - 1}.$$
Using the series expansion, we find that the order of the new approximation is at least $p + 1$:
(28)
$$\tilde A(h) = A - \frac{c_{p+1}}{2\,(2^{p} - 1)}\, h^{p+1} + \mathcal{O}(h^{p+2}).$$
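As a sanity check of Eq. 27, the following sketch applies one Richardson step to a first-order forward-difference approximation of a derivative. This is an illustration of the technique on a toy problem, not the paper's setup:

```python
import numpy as np

def deriv_forward(f, x, h):
    """First-order (p = 1) forward-difference approximation of f'(x)."""
    return (f(x + h) - f(x)) / h

def richardson(A, h, p):
    """One Richardson extrapolation step for a method A(h) of order p:
    combine A(h) and A(h/2) to cancel the leading error term."""
    return (2**p * A(h / 2) - A(h)) / (2**p - 1)

# Example: f = exp, f'(0) = 1. The extrapolated value is roughly
# second-order accurate although the underlying method is first order.
err_plain = abs(deriv_forward(np.exp, 0.0, 0.01) - 1.0)
err_rich = abs(richardson(lambda h: deriv_forward(np.exp, 0.0, h), 0.01, 1) - 1.0)
```

With $h = 0.01$, the plain error is of order $h/2$ while the extrapolated error is of order $h^2$, i.e., smaller by several orders of magnitude.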
We apply this idea to $\mathrm{FCF}^{[2]}_1$ and $\mathrm{FCF}^{[4]}_2$ to obtain the $\mathrm{FCF}^{[2]}_1$-RE and $\mathrm{FCF}^{[4]}_2$-RE algorithms. We remark that series acceleration can also be applied to the slow algorithms in Sec. II-C.
IV-B Numerical Examples
We test $\mathrm{FCF}^{[2]}_1$-RE and $\mathrm{FCF}^{[4]}_2$-RE against $\mathrm{FCF}^{[2]}_1$ and $\mathrm{FCF}^{[4]}_2$ for all three examples. Since Richardson extrapolation requires us to compute two approximations, which increases the computational cost, we again evaluate the complexity–accuracy tradeoff. We plot the error versus execution time curves for the three examples in Figs. 13 to 15. In all figures we can see that the FCF_RE algorithms achieve slopes of $p + 2$ rather than the expected slope of $p + 1$. This is an example of superconvergence [33]. Specifically, the error of $\mathrm{FCF}^{[2]}_1$-RE decreases with slope four and that of $\mathrm{FCF}^{[4]}_2$-RE with slope six. As seen before in Sec. II-D, the arithmetic error starts to dominate after a certain point and causes the error to rise. Although the execution times of the FCF_RE algorithms are higher, the error achieved is almost an order of magnitude lower even for large $h$. From the other viewpoint, for the same execution time, the FCF_RE algorithms achieve significantly lower errors than the FCF algorithms. $\mathrm{FCF}^{[4]}_2$-RE outperforms $\mathrm{FCF}^{[2]}_1$-RE in the tradeoff for all three examples, again highlighting the advantage of using higher-order CFQM exponential integrators. These results suggest that Richardson extrapolation should be applied to improve the considered FNFT algorithms. The $\mathrm{FCF}^{[4]}_2$-RE algorithm provides the best tradeoff among all the algorithms presented in this paper.
V Computing Eigenvalues
Recall that for the case of the focusing NSE ($\kappa = +1$), the nonlinear Fourier spectrum has two parts: a continuous and a discrete part. In this section, we are concerned with the numerical computation of the discrete part. We first mention some of the existing approaches and then show how one of them can be extended to work with the new fast higher-order NFT algorithms. We will also show that Richardson extrapolation can be applied to improve the accuracy at virtually no extra computational cost.
V-A Existing Approaches
Finding the eigenvalues consists of finding the roots of the function $a(\lambda)$ in the complex upper half-plane. Most of the existing approaches can be classified into four main categories.

Search methods: Newton’s method.

Eigenmethods: Spectral methods based on the solution of a suitably designed eigenproblem [44].

Hybrid methods: Any combination of the above. Eigenmethods with coarser sampling can, e.g., be used to find initial guesses for a search method [4].
Our proposed method will be a hybrid of an eigenmethod and a search method in the spirit of [4], where an eigenproblem is solved to obtain initial guesses for Newton's method.
V-B Proposed Method
Remember that the discrete spectrum consists of the eigenvalues, which are the zeros of $a(\lambda)$ in the complex upper half-plane $\mathbb{C}^{+}$, and their associated norming constants. We start with discussing an approximation of $a(\lambda)$ that will be useful for locating the eigenvalues. From Eqs. 3, 4 and 5 we can write
(29)
$$a(\lambda) = \lim_{t \to \infty} [V(\lambda, t)]_{11}\, e^{\mathrm{i}\lambda t}.$$
Over the finite interval $[T_-, T_+]$, using Eq. 9, we can see that
(30)
$$\hat a(\lambda) = e^{\mathrm{i}\lambda T_+}\, \big[ \hat V(\lambda, T_+) \big]_{11}.$$
Hence we hope that the zeros of $\hat a(\lambda)$ are approximations of the zeros of $a(\lambda)$ if the signal truncation and discretization errors are small enough. In Sec. III-A we explained how the transfer matrix can be approximated by a rational function in a transformed coordinate $z$. Hence we can further write
(31)
$$\hat a(z) = \frac{n(z)}{d(z)},$$
where $n(z)$ and $d(z)$ are polynomials in $z$. Let $N$ denote the degree of $n(z)$. Thus $n(z)$ will have $N$ zeros. These zeros, or roots of $n(z)$, can be found using various methods [23]. Of these zeros, there should be $K$ (typically, $K \ll N$) values that are approximations of the zeros of $a(\lambda)$ in $\mathbb{C}^{+}$.
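The root-finding step can be sketched as follows. Here `np.roots` (a companion-matrix eigenmethod) stands in for the faster algorithm of [5], and `z_to_lambda` is a hypothetical placeholder for the inverse of the coordinate transform, which depends on the chosen splitting scheme:

```python
import numpy as np

def candidate_eigenvalues(numer_coeffs, z_to_lambda):
    """Find all roots of the numerator polynomial of the rational
    approximation of a(lambda), map them back to the lambda plane, and keep
    only the candidates in the open upper half-plane."""
    roots_z = np.roots(numer_coeffs)                  # companion-matrix eigenmethod
    lams = np.array([z_to_lambda(z) for z in roots_z])
    return lams[lams.imag > 0]
```

Further filtering (e.g., by magnitude, as discussed below) is applied before the candidates are passed to Newton's method.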
Example
We would like to add clarity through a visual representation of the roots. We choose the signal from Example 1. We plot all the zeros of $n(z)$ with 'x' in Fig. 16. We can then map these zeros back to obtain values of $\lambda$. These are plotted with 'x' in Fig. 17. From the definition of the discrete spectrum, we can filter out all the values that are not in $\mathbb{C}^{+}$. We then filter further and keep only values of $\lambda$ whose magnitude stays below a limit imposed by the chosen mapping from $\lambda$ to $z$. The filtered roots are plotted in Figs. 17 and 16 with 'o'. For the chosen parameters, the set of eigenvalues is known analytically. From Fig. 17 we see that the values marked with 'o' are indeed approximations of the values in that set. However, there is no guarantee that we will always be able to locate approximations of all eigenvalues, as that depends on several factors, among them the signal magnitude, the signal interval, the step size and the values of the eigenvalues themselves.
For the example chosen in the visual demonstration, the number of zeros $N$ is much larger than the number of eigenvalues $K$. For the chosen mapping from $\lambda$ to $z$, the values of interest will always lie inside the unit circle in Fig. 16, and most of the other, spurious zeros of $n(z)$ will lie on the unit circle. Even with the best eigenmethods available for polynomial root finding, which have a complexity of $\mathcal{O}(N^2)$ [5], the execution time grows very steeply, making this approach infeasible for large $D$. To reduce the complexity, it was suggested in [4] to subsample the given signal to reduce the dimensionality of the root-finding problem. The algorithm is summarized below.
We now discuss the three stages of the algorithm in detail.

Root finding from subsampled signal
The given signal is subsampled to give a shorter sequence with $D_{\mathrm{sub}}$ samples and a correspondingly larger step size. There are no results for the minimum number of samples that guarantee that all eigenvalues will be found. One choice can be based on limiting the overall computational complexity to $\mathcal{O}(D \log^2 D)$, which is the complexity of computing the reflection coefficient. For a rootfinding algorithm with $\mathcal{O}(D_{\mathrm{sub}}^2)$ complexity, we hence choose $D_{\mathrm{sub}} = \mathcal{O}(\sqrt{D}\log D)$ samples. The polynomial $p(z)$ is then built from these samples. For the higher-order methods, the nonequispaced samples should be obtained from the original $D$ samples and not from the $D_{\mathrm{sub}}$ subsamples. An eigenmethod is then used to find all zeros of $p(z)$. We used the algorithm in [5]. The values of $z$ are mapped back to $\lambda$ and filtered to remove implausible values.
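A minimal sketch of the subsampling step is shown below. The rule $D_{\mathrm{sub}} \approx \sqrt{D}\,\log_2 D$ is an assumption chosen so that an $\mathcal{O}(D_{\mathrm{sub}}^2)$ rootfinder costs about as much as the $\mathcal{O}(D\log^2 D)$ computation of the reflection coefficient; the exact constant and index selection are illustrative, not the paper's prescription.

```python
import numpy as np

def subsample(q, T):
    """Subsample the signal q (equispaced on an interval of length T)
    so that an O(D_sub^2) rootfinder matches O(D log^2 D) overall cost."""
    D = len(q)
    # Solve O(D_sub^2) = O(D log^2 D) for D_sub:  D_sub ~ sqrt(D) * log D
    D_sub = min(int(np.ceil(np.sqrt(D) * np.log2(D))), D)
    # Pick a roughly equispaced subset of the original sample indices.
    idx = np.linspace(0, D - 1, D_sub).astype(int)
    h_sub = T / D_sub  # coarser step size of the subsampled grid
    return q[idx], h_sub

q = np.random.randn(1024)          # toy signal samples
q_sub, h_sub = subsample(q, T=32.0)
```

The polynomial for the rootfinder would then be built from `q_sub` instead of `q`.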
Root refinement using Newton method
The Newton method based on the slow CF methods is used for root refinement. The derivative $a'(\lambda)$ is calculated numerically along with $a(\lambda)$ as in [10] using all $D$ given samples. The values of $\lambda$ that remain after the filtering in the previous step are used as the initial guesses for the Newton method. We chose to stop the iterations when the change in the root value drops below a set tolerance or when a maximum of 15 iterations is reached. The resulting roots are filtered again.
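The refinement step can be sketched as a standard Newton iteration with the stopping rule described above. In the actual algorithm the derivative is computed alongside $a(\lambda)$ as in [10]; here, as a simplifying assumption, it is approximated by a central difference, and the scattering coefficient is replaced by a toy function with a known zero.

```python
def newton_refine(a, lam0, tol=1e-9, max_iter=15, eps=1e-6):
    """Refine a root of a(lam) by Newton's method, starting at lam0.
    Stops when the update falls below tol or after max_iter iterations."""
    lam = lam0
    for _ in range(max_iter):
        # Central-difference derivative (assumption; the paper computes
        # a'(lam) together with a(lam) as in [10]).
        d = (a(lam + eps) - a(lam - eps)) / (2 * eps)
        step = a(lam) / d
        lam = lam - step
        if abs(step) < tol:
            break
    return lam

# Toy scattering coefficient with a known zero at lam = 2j.
a = lambda lam: lam - 2j
root = newton_refine(a, 1.5j)
```

Each filtered candidate from the subsampled rootfinder would be passed through `newton_refine` using the full-resolution data.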
Richardson extrapolation
We pair the roots resulting from the Newton step, $\hat{\lambda}_i$, with the corresponding initial guesses $\check{\lambda}_i$. Then, we apply Richardson extrapolation,
$$\lambda_i^{\mathrm{RE}} = \frac{(h_{\mathrm{sub}}/h)^m\,\hat{\lambda}_i - \check{\lambda}_i}{(h_{\mathrm{sub}}/h)^m - 1}, \qquad (32)$$
where $m$ is the order of the underlying method, $h$ is the step size of the full grid, and $h_{\mathrm{sub}}$ that of the subsampled grid. The extrapolated value $\lambda_i^{\mathrm{RE}}$ is then an improved approximation and constitutes the discrete part of the FCF_RE algorithm. It may so happen that more than one initial guess converges to the same root $\hat{\lambda}_i$. In such cases the initial guess closest to $\hat{\lambda}_i$ should be used for Richardson extrapolation. The other initial guesses that also converged to the same root should be treated as spurious values and discarded.
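A minimal sketch of this extrapolation step, assuming the standard Richardson combination of two approximations of the same eigenvalue computed with step sizes $h$ and $h_{\mathrm{sub}}$ by a method of order $m$:

```python
def richardson(lam_h, lam_hsub, h, h_sub, m):
    """Richardson extrapolation for a method of order m.
    lam_h:    refined root computed on the full grid (step h)
    lam_hsub: initial guess from the subsampled grid (step h_sub > h)
    Cancels the leading O(h^m) error term shared by both approximations."""
    r = (h_sub / h) ** m
    return (r * lam_h - lam_hsub) / (r - 1)

# Example: two order-2 approximations of the value 2j whose errors
# behave like C*h^2 with the same constant C; extrapolation recovers 2j.
approx = richardson(2j + 0.01, 2j + 0.16, h=0.1, h_sub=0.4, m=2)
```

Since the two approximations are already available from the previous stages, the extra cost of this step is negligible.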
The numerical algorithms may not find particular eigenvalues or may find spurious ones. Let $\hat{S}$ be the set of approximations found by an algorithm and $S$ the set of true eigenvalues. To penalize both missed values and incorrect spurious values at the same time, we define the error
$$E = \max\left\{ \max_{\lambda\in S}\,\min_{\hat{\lambda}\in\hat{S}} |\lambda-\hat{\lambda}|,\ \max_{\hat{\lambda}\in\hat{S}}\,\min_{\lambda\in S} |\lambda-\hat{\lambda}| \right\} \qquad (33)$$
Note that the first term in the outer maximum grows large if an algorithm fails to approximate a part of the set $S$, while the second term becomes large if an algorithm finds spurious values that have no correspondence with values in $S$. The error $E$ is expected to be of order $\mathcal{O}(h^m)$ for an algorithm of order $m$.
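This two-sided error is a Hausdorff-type distance between the two sets and can be evaluated directly from the pairwise distances; a short sketch:

```python
import numpy as np

def eigenvalue_error(true_set, found_set):
    """Two-sided error between the true eigenvalues and the found
    approximations: the first term is large if a true eigenvalue is
    missed, the second if a spurious value is found."""
    S = np.asarray(true_set)
    F = np.asarray(found_set)
    if S.size == 0 or F.size == 0:
        return np.inf
    dist = np.abs(S[:, None] - F[None, :])  # pairwise distances
    missed = dist.min(axis=1).max()    # worst-approximated true eigenvalue
    spurious = dist.min(axis=0).max()  # found value farthest from any true one
    return max(missed, spurious)

err = eigenvalue_error([1j, 2j], [1.001j, 2j])
```

A single spurious root far from the true spectrum dominates the error, which is exactly the penalizing behavior described above.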
V-C Numerical Example
In this section, we compare different variants of our proposed algorithm using Example 1. We compute the error (33) for the following three types of algorithms:

1. Discrete part of FCF algorithms. An eigenmethod is applied to the polynomial approximation built using all $D$ samples. No subsampling is used.

2. Discrete part of FCF algorithms with subsampling. Only steps 1 and 2 of the algorithm mentioned above are used.

3. Discrete part of FCF_RE algorithms. All three steps mentioned above are used.
To demonstrate the effect of subsampling, we show in Fig. 19 the errors for the second- and fourth-order algorithms of types 1 and 2. For small numbers of samples the errors are high, either due to failure to find approximations close to the actual eigenvalues or due to spurious values. For larger numbers of samples, the fourth-order algorithms of type 1 and of type 2 find exactly five values that are close approximations of the true eigenvalues. The second-order algorithms of type 1 and of type 2, however, find good approximations only for even larger numbers of samples. The error of the second-order algorithms decreases with slope two and that of the fourth-order algorithms with slope four, as expected from the order of the underlying numerical schemes.
In Fig. 19 we show the errors for the second- and fourth-order algorithms of types 2 and 3 to indicate the advantage of the extrapolation step. The extrapolation step improves the approximation of the second-order algorithm significantly while adding negligible computational cost. For the fourth-order algorithm, however, there is only a minor improvement of type 3 over type 2.
In Fig. 20 we plot the execution times for the FCF algorithms of types 1 and 2. The execution times of the algorithms of type 3 are almost the same as those of type 2. For type 1 algorithms, these times include the time required to build $p(z)$ and the time taken by the rootfinder. For algorithms of type 2, the additional time required for root refinement by Newton's method is also included. Even with subsampling, we see that the execution times are an order of magnitude higher than those for the continuous part. The FCF_RE algorithms seem to provide the best trade-off between accuracy and computation cost, similar to the case of the continuous part. The overall computational complexity may be decreased by using alternative methods to find the initial guesses.
VI Conclusion
In this paper, we proposed new higher-order nonlinear Fourier transform algorithms based on a special class of exponential integrators. We also showed how some of these algorithms can be made fast using special higher-order exponential splittings. The accuracy of these fast algorithms was improved even further using Richardson extrapolation. Numerical demonstrations showed that the proposed algorithms are highly accurate and provide much better complexity-accuracy trade-offs than existing algorithms. In the future, we plan to integrate the algorithms from this paper into the open-source software library FNFT [40].
References
 [1] M. J. Ablowitz, D. J. Kaup, A. C. Newell, and H. Segur, “The Inverse Scattering Transform-Fourier Analysis for Nonlinear Problems,” Studies in Applied Mathematics, vol. 53, no. 4, pp. 249–315, 1974.
 [2] G. P. Agrawal, Fiber-Optic Communication Systems, 4th ed. Hoboken, N.J., USA: Wiley, 2013.
 [3] E. Agrell et al., “Roadmap of Optical Communications,” Journal of Optics, vol. 18, no. 6, p. 063002, 2016.
 [4] V. Aref, S. T. Le, and H. Buelow, “Modulation Over Nonlinear Fourier Spectrum: Continuous and Discrete Spectrum,” Journal of Lightwave Technology, vol. 36, no. 6, pp. 1289–1295, March 2018.
 [5] J. Aurentz, T. Mach, R. Vandebril, and D. Watkins, “Fast and backward stable computation of roots of polynomials,” SIAM Journal on Matrix Analysis and Applications, vol. 36, no. 3, pp. 942–973, 2015. [Online]. Available: https://doi.org/10.1137/140983434
 [6] D. S. Bernstein and W. So, “Some Explicit Formulas for the Matrix Exponential,” IEEE Transactions on Automatic Control, vol. 38, no. 8, pp. 1228–1232, Aug 1993.
 [7] S. Blanes, F. Casas, J. Oteo, and J. Ros, “The Magnus Expansion and Some of its Applications,” Physics Reports, vol. 470, no. 5, pp. 151–238, 2009.
 [8] S. Blanes, F. Casas, and M. Thalhammer, “High-order Commutator-free Quasi-Magnus Exponential Integrators for Nonautonomous Linear Evolution Equations,” Computer Physics Communications, vol. 220, pp. 243–262, 2017.
 [9] J. L. Blue and H. K. Gummel, “Rational Approximations to Matrix Exponential for Systems of Stiff Differential Equations,” Journal of Computational Physics, vol. 5, no. 1, pp. 70–83, 1970.
 [10] G. Boffetta and A. R. Osborne, “Computation of the Direct Scattering Transform for the Nonlinear Schrödinger Equation,” Journal of Computational Physics, vol. 102, no. 2, pp. 252–264, 1992.
 [11] A. Bruckstein and T. Kailath, “An Inverse Scattering Framework for Several Problems in Signal Processing,” IEEE ASSP Magazine, vol. 4, no. 1, pp. 6–20, Jan 1987.
 [12] M. Brühl, “Direct and Inverse Nonlinear Fourier Transform based on the Korteweg-de Vries Equation (KdV-NLFT): a Spectral Analysis of Nonlinear Surface Waves in Shallow Water,” Ph.D. dissertation, Feb 2014.
 [13] S. Burtsev, R. Camassa, and I. Timofeyev, “Numerical Algorithms for the Direct Spectral Transform with Applications to Nonlinear Schrödinger Type Systems,” Journal of Computational Physics, vol. 147, no. 1, pp. 166–186, 1998.
 [14] S. Chimmalgi, P. J. Prins, and S. Wahls, “Nonlinear Fourier Transform Algorithm Using a Higher Order Exponential Integrator,” in Advanced Photonics 2018. Optical Society of America, 2018, p. SpM4G.5.
 [15] Y. Choi, J. Chun, T. Kim, and J. Bae, “The Schur Algorithm Applied to the One-Dimensional Continuous Inverse Scattering Problem,” IEEE Transactions on Signal Processing, vol. 61, no. 13, pp. 3311–3320, 2013.
 [16] S. Civelli, S. K. Turitsyn, M. Secondini, and J. E. Prilepsky, “Polarization-multiplexed Nonlinear Inverse Synthesis with Standard and Reduced-complexity NFT Processing,” Optics Express, vol. 26, no. 13, pp. 17360–17377, 2018.
 [17] C. L. Epstein, “How well does the finite Fourier transform approximate the Fourier transform?” Communications on Pure and Applied Mathematics, vol. 58, no. 10, pp. 1421–1435, 2005.
 [18] L. D. Faddeev and L. Takhtajan, Hamiltonian Methods in the Theory of Solitons. SpringerVerlag Berlin Heidelberg, 2007.
 [19] T. Gui, G. Zhou, C. Lu, A. P. T. Lau, and S. Wahls, “Nonlinear Frequency Division Multiplexing with b-Modulation: Shifting the Energy Barrier,” Opt. Express, vol. 26, no. 21, pp. 27978–27990, Oct 2018.
 [20] J. Keiner, S. Kunis, and D. Potts, “Using NFFT 3: A Software Library for Various Nonequispaced Fast Fourier Transforms,” ACM Transactions on Mathematical Software, vol. 36, no. 4, pp. 19:1–19:30, Aug. 2009.
 [21] S. T. Le, K. Schuh, F. Buchali, and H. Buelow, “100 Gbps b-modulated Nonlinear Frequency Division Multiplexed Transmission,” in 2018 Optical Fiber Communications Conference and Exposition (OFC), March 2018, pp. 1–3.
 [22] E. H.-L. Liu, “Fundamental Methods of Numerical Extrapolation with Applications,” MIT OpenCourseWare, Massachusetts Institute of Technology, 2006, accessed: 1112018. [Online]. Available: http://web.mit.edu/ehliu/Public/Spring2006/18.304/extrapolation.pdf
 [23] J. McNamee and V. Y. Pan, “Efficient Polynomial Root-refiners: A Survey and New Record Efficiency Estimates,” Computers & Mathematics with Applications, vol. 63, no. 1, pp. 239–254, 2012.
 [24] A. R. Osborne, “Nonlinear Fourier Methods for Ocean Waves,” Procedia IUTAM, vol. 26, pp. 112–123, 2018.
 [25] E. V. Podivilov, D. A. Shapiro, and D. A. Trubitsyn, “Exactly Solvable Profiles of Quasi-rectangular Bragg Filter with Dispersion Compensation,” Journal of Optics A: Pure and Applied Optics, vol. 8, no. 9, p. 788, 2006.
 [26] P. J. Prins and S. Wahls, “Higher Order Exponential Splittings for the Fast Nonlinear Fourier Transform of the Korteweg-de Vries Equation,” in 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), April 2018.
 [27] L. Rabiner, R. Schafer, and C. Rader, “The chirp z-transform algorithm,” IEEE Transactions on Audio and Electroacoustics, vol. 17, no. 2, pp. 86–92, June 1969.
 [28] D. Rourke and P. Morris, “Half-Solitons as Solutions to the Zakharov-Shabat Eigenvalue Problem for Rational Reflection Coefficient with Application in the Design of Selective Pulses in Nuclear Magnetic Resonance,” Phys. Rev. A, vol. 46, no. 7, pp. 3631–3636, 1992.
 [29] J. Satsuma and N. Yajima, “B. Initial Value Problems of One-Dimensional Self-Modulation of Nonlinear Waves in Dispersive Media,” Progress of Theoretical Physics Supplement, vol. 55, pp. 284–306, 1974.
 [30] A. Sidi, Practical Extrapolation Methods: Theory and Applications, 1st ed. New York, NY, USA: Cambridge University Press, 2002.
 [31] A. C. Singer, A. V. Oppenheim, and G. W. Wornell, “Detection and Estimation of Multiplexed Soliton Signals,” IEEE Transactions on Signal Processing, vol. 47, no. 10, pp. 2768–2782, Oct 1999.
 [32] O. V. Sinkin, R. Holzlohner, J. Zweck, and C. R. Menyuk, “Optimization of the Split-step Fourier Method in Modeling Optical-fiber Communications Systems,” Journal of Lightwave Technology, vol. 21, no. 1, pp. 61–68, Jan 2003.
 [33] I. H. Sloan, Superconvergence. Boston, MA: Springer US, 1990, pp. 35–70.
 [34] M. Thalhammer, “A Fourth-order Commutator-free Exponential Integrator for Nonautonomous Differential Equations,” SIAM Journal on Numerical Analysis, vol. 44, no. 2, pp. 851–864, 2006. [Online]. Available: https://doi.org/10.1137/05063042
 [35] S. K. Turitsyn et al., “Nonlinear Fourier Transform for Optical Data Processing and Transmission: Advances and Perspectives,” Optica, vol. 4, no. 3, pp. 307–322, Mar 2017.
 [36] V. Vaibhav, “Higher Order Convergent Fast Nonlinear Fourier Transform,” IEEE Photonics Technology Letters, vol. 30, no. 8, pp. 700–703, April 2018.
 [37] A. Vasylchenkova, J. Prilepsky, D. Shepelsky, and A. Chattopadhyay, “Direct Nonlinear Fourier Transform Algorithms for the Computation of Solitonic Spectra in Focusing Nonlinear Schrödinger Equation,” Communications in Nonlinear Science and Numerical Simulation, vol. 68, pp. 347–371, 2019.
 [38] A. Vasylchenkova, J. E. Prilepsky, and S. K. Turitsyn, “Contour Integrals for Numerical Computation of Discrete Eigenvalues in the Zakharov-Shabat Problem,” Opt. Lett., vol. 43, no. 15, pp. 3690–3693, Aug 2018.
 [39] S. Wahls, “Generation of Time-limited Signals in the Nonlinear Fourier Domain via b-Modulation,” in 2017 European Conference on Optical Communication (ECOC), Sept 2017, pp. 1–3.
 [40] S. Wahls, S. Chimmalgi, and P. Prins, “FNFT: A Software Library for Computing Nonlinear Fourier Transforms,” The Journal of Open Source Software, vol. 3, p. 597, 2018.
 [41] S. Wahls and H. V. Poor, “Introducing the Fast Nonlinear Fourier Transform,” in 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, May 2013, pp. 5780–5784.
 [42] ——, “Fast Numerical Nonlinear Fourier Transforms,” IEEE Transactions on Information Theory, vol. 61, no. 12, pp. 6957–6974, Dec 2015.
 [43] M. Yamada and K. Sakuda, “Analysis of Almost-periodic Distributed Feedback Slab Waveguides via a Fundamental Matrix Approach,” Appl. Opt., vol. 26, no. 16, pp. 3474–3478, Aug 1987.
 [44] M. I. Yousefi and F. R. Kschischang, “Information Transmission Using the Nonlinear Fourier Transform, Part I: Mathematical Tools,” IEEE Transactions on Information Theory, vol. 60, no. 7, pp. 4312–4328, July 2014.
 [45] ——, “Information Transmission Using the Nonlinear Fourier Transform, Part II: Numerical Methods,” IEEE Transactions on Information Theory, vol. 60, no. 7, pp. 4329–4345, July 2014.
 [46] V. Zakharov and A. Shabat, “Exact Theory of Two-Dimensional Self-Focusing and One-Dimensional Self-Modulation of Waves in Nonlinear Media,” Soviet Physics JETP, vol. 34, no. 1, p. 62, January 1972.