FDTD: solving 1+1D delay PDE

by   Yao-Lung L. Fang, et al.

We present a proof of concept for adapting the finite-difference time-domain method (FDTD) for solving a 1+1D complex-valued, delay partial differential equation (PDE) that emerges in the study of waveguide quantum electrodynamics (QED). The delay term exists in both spatial and temporal directions, rendering the conventional approaches such as the method of lines inapplicable. We show that by properly designing the grid and by using the exact (partial) solution as the boundary condition, the delay PDE can be numerically solved. Our code provides a numerically exact solution to the time-dependent multi-photon scattering problem in waveguide QED. The program is written in C and open-sourced on GitHub.



There are no comments yet.


page 3

page 13


FDTD: solving 1+1D delay PDE in parallel

We present a proof of concept for solving a 1+1D complex-valued, delay p...

Data-Driven Deep Learning of Partial Differential Equations in Modal Space

We present a framework for recovering/approximating unknown time-depende...

Probabilistic Numerical Method of Lines for Time-Dependent Partial Differential Equations

This work develops a class of probabilistic algorithms for the numerical...

DiffusionNet: Accelerating the solution of Time-Dependent partial differential equations using deep learning

We present our deep learning framework to solve and accelerate the Time-...

A Latent space solver for PDE generalization

In this work we propose a hybrid solver to solve partial differential eq...

Improved Penalty Algorithm for Mixed Integer PDE Constrained Optimization (MIPDECO) Problems

Optimal control problems including partial differential equation (PDE) a...

P_N-Method for Multiple Scattering in Participating Media

Rendering highly scattering participating media using brute force path t...

Code Repositories


Solve 1+1D delay PDE using FDTD

view repo
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Waveguide quantum electrodynamics (QED) concerns the interaction between one-dimensional (1D) waveguide photons and local emitters (atoms, qubits, etc.)

(1; 2; 3; 4; 5). In the cases where a coherent feedback loop is formed due to the presence of, for example, multiple distant emitters or a perfect mirror terminating the waveguide, the propagation of photons needs to be taken into account rigorously if the time of flight between distant objects is non-negligible compared to the decay time of the qubits (6; 7; 8; 9; 10; 11).

The study of time evolution (i.e., dynamics) in waveguide QED has drawn considerable attention (12; 13; 14; 15; 16; 17; 18; 19; 20; 21; 22), since it offers full information of light-matter interaction based on which more precise control of the system or detailed analysis of quantum non-Markovianity (23; 24; 25) can be performed. Insight into the non-equilibrium physics can also be obtained. While powerful numerical approaches, such as density matrix renormalization group (DMRG) (26; 27; 28) whose accuracy for solving 1D systems is undoubtedly demonstrated, have been applied recently to waveguide QED problems with feedback loops (11; 22), in the present paper and Ref. (29) we show that simply solving the time-dependent Schrödinger equation can provide a perhaps more natural and intuitive viewpoint on the scattering physics. However, in the presence of delay, the Schrödinger equation leads to a delay partial differential equation (PDE), solving which is a significant technical challenge to be addressed by this paper.

As a concrete example, we consider a two-level system (2LS) coupled to a semi-infinite waveguide, one end of which is terminated by a perfect mirror (8; 29). Under the rotating-wave approximation, the number of excitations in a waveguide-QED system is conserved. Therefore, one may partition the full Hilbert space into different number sectors. While the dynamics in the one-excitation sector is described by a delay differential equation (DDE)111To be more consistent with respect to its PDE counterpart, one could use the acronym “delay ODE”, which stands for delay ordinary differential equation. However, we use DDE to follow the convention in the literature.

, a one-variable ordinary differential equation that has a delay term

(29; 30; 31) and can be solved straightforwardly (32), the multi-excitation sectors require nontrivial care. In particular, in the two-excitation sector there can be either two propagating photons in the waveguide (wavefunction denoted by hereafter), or one flying and the other absorbed by the 2LS (); note that a 2LS can only hold one photon at a time. Therefore, starting from the Schrödinger equation and unfolding the half space, we arrive at a 1+1D delay PDE that describes the (complex-valued) time-dependent wavefunction, , of the 2LS plus a photon at position :


where is the atom-mirror separation, () is the 2LS frequency (decay rate), is the step function, is the time-dependent two-photon wavefunction (one at position , another at ), and we set . By solving Eq. (1) for , the full dynamics of the system can be completely determined, since the two-photon wavefunction can be written in terms of :


Note that is symmetric under the exchange of and due to bosonic statistics. The detailed discussion of this problem and the derivation of above equations are reported in (29).

Before proceeding, we first reiterate that the focus of the present paper is on delay PDE, not DDE. As stated above, solving DDE subject to appropriate initial conditions, in particular linear DDE of the form , is simple and in fact standard (32). The solution usually consists of an infinite sum of piecewise functions whose interval is dictated by the delay . For solving more complicated cases, such as a nonlinear DDE, DDE with time-dependent coefficients, a system of DDEs, and possibly a combination of these, there are also sophisticated numerical routines provided in, e.g., Matlab and Mathematica. Second, while delay PDEs have emerged in some scientific and engineering contexts and been numerically studied using, for example, the method of lines (33), we emphasize that as far as we understand, those approaches cannot be directly applied to our problem, as the delay term in Eq. (1) lies in both and dimensions (i.e., it’s a non-local delay), contrary to the common situation of delay PDE in which only one of the dimensions is delayed (33). In other words, Eq. (1) cannot be converted to a system of ordinary differential equations (ODE) that is discretized in , and then be solved by an ODE/DDE solver along . Furthermore, to the best of our knowledge there is no general-purpose solvers for delay PDE. As a result, we adapt and implement the finite-difference time-domain (FDTD) method for solving the delay PDE (1) and demonstrate its validity in this paper.

Moreover, because of the feedback loop the system is highly “non-Markovian”, a jargon widely used in the community of open quantum systems (23; 24; 25) meaning the quantum system has a dependence on its past history. Since solving the wavefunction requires looking up the system’s memory (values of solved at earlier times), as we will see this imposes stringent constraints on how the problem can be parallelized. In the paper, we present, implement, and benchmark two different multi-thread approaches, swarm and wavefront, to address this issue.

The main purpose of this work is therefore three-fold: (i) to numerically solve Eq. (1) using FDTD; (ii) to provide a proof of concept that FDTD works well for tackling complex-valued, spatially non-local delay PDE and that certain degree of parallelism can be achieved; (iii) to present a numerically exact solution to the time-dependent, multi-photon scattering problem in waveguide QED.

2 Method and targeted problems

FDTD is widely used by engineers in antenna designs, computational electrodynamics, plasmonics, etc. Below we briefly discuss how FDTD works, and refer interested readers to Refs. (34; 35) for the details.

Like most of differential-equation solving methods, FDTD discretizes the spacetime, and different discretization schemes have their own advantages and disadvantages. For simplicity we choose a square lattice, and note in passing that in real FDTD applications the Yee (staggered) lattice is more common, as the conservation laws of EM fields are trivially hold on the Yee lattice. In the following we set the ratio of the spatial step to the temporal step equal to the speed of light so that . In FDTD, is called the Courant condition and is hold when the algorithm is stable. See A for a simplified discussion on stability.

We next express every term in Eq. (1) using finite differences. We use the “leapfrog” prescription that has accuracy (36):


In the code we call Eq. (3) a “square”, because the value at the center of a square is approximated using the values at its four corners. These discretization rules apply to the terms in the first line of Eq. (1) so that given the values at three corners, the value at the top right corner of a square can be solved. As for those terms in the second and the third lines of Eq. (1), hereafter referred to as the source terms (29), we need a different representation:


which we call a “bar”. The reason for using bars over squares will become clear shortly.

Figure 1: The schematic of the spacetime layout used in FDTD. Note that the lines locate in the blue region if , and that is at the center of the grid.
Parameter Description Mandatory Default
Nx Defined such that total grid points (to be solved) along is (so ) Yes n/a
Ny Total grid points along (so ) Yes n/a
nx ; needs to be an integer multiple of 2 and Yes n/a
Delta Step size (See also B) Yes n/a
k Driving frequency (in units of ) Yes 0
k1, k2 Incident frequencies for photon #1 (#2) (in units of ) No222Needed when init_cond=3 and identical_photons=0. 0
w0 2LS frequency (in units of ) Yes n/a
gamma 2LS decay rate (in units of ) Yes n/a
init_cond 1: two-photon plane wave; 2: stimulated emission (one-photon exponential wavepacket); 3: two-photon exponential wavepacket Yes 0
alpha Wavepacket width (in units of ) No333Needed when init_cond=2, and ineffective when init_cond=1. 0
alpha1, alpha2 Wavepacket width for photon #1 (#2) (in units of ) No2 0
identical_photons whether or not the two incident photons are the same No 1
save_chi Output as plain text No444These options cannot be simultaneously turned off (set to 0), or no output will be generated. 0
save_psi Output as plain text No4 0
save_psi_binary Output as binary No4 0
save_psi_square_integral Output as plain text No4 0
measure_NM See D No4,555Requires init_cond=2. 0
Tstep Output the wavefunctions for every temporal steps No 0
Nth Number of solvers No 1
Table 1: Summary of all input parameters accepted by our program.

Now we can put together squares and bars to discretize Eq. (1). For putting the problem on a computer, we also need to draw a “box” as we cannot walk through the entire spacetime indefinitely. As a result, we need to specify 4 parameters to define the box geometry: , , , and ; see Table 1 for the list of accepted input parameters. The corresponding layout is shown in Fig. 1. Note that (a) the initial condition is given on the line (the purple stripe); (b) the boundary condition is given for not just one line, as needed for solving ordinary (space-local and time-local) PDEs, but for a wide area (the green region) because of the delay and source terms; (c) in order to reach and to make the layout well-defined, we need to be an integer multiple of 2 and ; (d) the step size can be given arbitrarily, see B.

It is illustrative to see how the FDTD solves Eq. (1). Fig. 2 shows a snapshot of the FDTD solver which marches in a space-then-time manner. Since the delay PDE is chiral (unidirectional) (29), for a given time the solver (conceptually represented by the black cross) moves from the left edge of the box to the right, then advances one step in time and repeats. Each term in Eq. (1) has a different color for easy identification, and we use all previously solved values to solve for the top-right corner of the blue square, which is circled in red. Note that there are four colors, each of which has only two points (the bars), because when we Taylor-expand at the black cross, those terms are expanded at the center between the two points.

Figure 2: An example of calculating for the circled point on the square lattice. The slant lines represent light cones extended from the coupling points at . is chosen to be 4 for illustrative purposes, and the black cross denotes the Taylor-expansion point, referred to as the solver. The other colored points contribute to the point to be solved [each color corresponds to a term in Eq. (1)].

Furthermore, from Fig. 2 one can appreciate the fact that in general the system is highly non-Markovian. In terms of programming, this brings in a heavy burden because, for performance reasons, one can no longer flush earlier values from memory to disk, as typically done in solving ordinary PDEs. Instead, one needs to keep all grid points in the memory, so the hardware capacity is an important factor; doing frequent I/O is not an efficient option. For example, as we are solving a complex wavefunction, depending on the grid size it can be very memory-intensive (each grid point stores a complex double number and thus takes 16 bytes) and space-intensive (the wavefunction is written to a plain txt when the calculation is done). Readers should use the program with caution.666

A quick estimation for memory usage is roughly

(in GB), and for disk usage divided by .

Currently the FDTD program can solve three classes of problems, and for each class the boundary condition (green in Fig. 1) is given by known analytical expressions:

  1. Two-photon plane wave (29): the incoming photons are described by continuous wave, , and the initial condition for is simply (the 2LS is initially in its ground state). Therefore, we need to supply three more parameters that are physics-related: , , and ; see Table 1. In , the solution to Eq. (1) is


    where is the 2LS wavefunction, solved in the one-excitation sector assuming and ,


    where , and is the (lower) incomplete Gamma function (37). We note that evaluating on the complex plane is in general a non-trivial task; see C. The above expressions are used to generate both the initial and boundary conditions. Finally, we note that is set in the program for convenience.

  2. Stimulated emission (29): a single-photon exponential wavepacket of the form777The initial wavefront position is set at to shorten the computation time and to maximize interference effects, otherwise we need to wait for the wavepacket to arrive, and by then the qubit may already decay.


    is sent in, with the 2LS initially excited, so and . In this case, one also needs to specify the wavepacket width in the input file. In , the boundary condition is given by


    where is the 2LS wavefunction solved in the one-excitation sector with initial conditions and ,

  3. Two-photon exponential wavepacket (38): The corresponding initial conditions are and


    where is the normalization constant such that and is the -th photonic wavepacket, assumed of the form Eq. (7) with incident frequency and width . The normalization constant can be chosen to be positive without loss of generality:


    Following our standard procedure, we first solve for and then plug it into the FDTD code to solve in the region . We find that the solution is given by


    where is the qubit wavefunction in the one-excitation sector, solved subject to and :


    with . Note that when the two photons are identical, , the expression above reduces to the known form: [cf. Eq. (5)].

One could easily tweak the code to accept other kinds of initial conditions, but the strategy for solving Eq. (1) remains the same for all possible scenarios. It is important to note that due to the nature of the delay PDE (1), one cannot solve it numerically without knowing its analytical solutions in . But one do not need more than that either — the constraint guarantees that the knowledge in is sufficient, as it makes the line lie outside of the green region in Fig. 1 so that the boundary condition is completely determined.

Finally, to calculate various non-Markovian measures using the solution of , instead of post-processing it is much easier to do most of the work in situ. To construct the geometric measure (39), two functions and can be calculated (29).888In Ref. (29) is called and is called , respectively. The author regrets the inconvenience. The detail of the implementation is presented in D.

3 Parallelization

As discussed above, due to the non-local delay term there is a strong data dependency — the present solution depends on its past history. However, even with such a severely constrained problem, interestingly we find that the delay PDE (1) can still be solved in parallel, by which we mean that the simultaneous presence of multiple FDTD solvers in the spacetime is allowed. In this paper, we propose and implement two kinds of parallel approaches, referred to as the “swarm” and “wavefront” approaches, respectively, that respect the constraints imposed by the delay PDE (1).

Figure 3: A snapshot of multiple marching FDTD solvers in the swarm approach. Each color and shape corresponds to points read and written by a certain solver (thread). The points to be solved by each solver are empty, while known points are filled. Note that each solver is one step above and at least steps behind its predecessor (here ).

For the swarm approach (see Fig. 3), when solving Eq. (1) from the bottom up, each solver must be one temporal step above and at least spatial steps behind its next neighbor (called predecessor hereafter), or the execution order would not be preserved correctly; that is, one solver may read the value at certain point which is yet to be solved by another solver. It turns out that it is beneficial to have a “cyclic lag” relation among the solvers: denoting the total number of solvers as Nth, which can be set in the input file, then #2 is lagged behind #1, #3 behind #2, etc., and finally #1 can be set behind #Nth once it reaches the grid boundary. We find that such a wrap-around can increase performance and is easy for coding.

In the current implementation, we utilize the “wait-and-signal” mechanism provided by POSIX Threads (pthreads). Each solver is marched by an independent thread and has a lock for its position and a condition variable for waiting for its predecessor. Before attempting to solve at a certain point, it must look up its predecessor’s position and see whether the constraint is satisfied or not. If not, then it must wait for the signal sent by its predecessor before proceeding. Using pthreads’ locks not only allows a solver to communicate with and wait for its predecessor, but also preserves cache coherence, which is important since very often a solver needs to access a value immediately after it’s solved by another solver.

Figure 4: Schematic of the wavefront approach. We first solve in the region (top), then (middle), and finally (bottom). In the first and last regions, points on the same wavefront (represented by dotted lines) can be solved in parallel. In the middle region, due to causality only one solver is permitted to enter and move row by row. The solver(s) are propagated along the dashed lines in all regions. The colors and shapes of all points have the same meaning as in Fig. 3.

For the wavefront approach (see Fig. 4), we circumvent the requirement of separating steps in space by partitioning the spacetime into three regions: , , and , which are solved in turn. In the first and last regions, Eq. (1) can be solved by propagating the wavefront along the diagonal, and points on the same wavefront (along which the spatial lag between adjacent solvers is only one step) can be solved in parallel. In the middle region , which is -step wide, due to causality we use only one solver (as if no parallelization exists). Physically, it is clear (29) that in () we only have right- (left-) going photons, and in they are connected by the mirror.

We implement the wavefront approach using OpenMP, since within each wavefront it is simply an embarrassingly parallel problem; that is, there is no data dependency and thus no need to lock or signal. Moreover, because most of the time a wavefront typically contains at least hundreds of points, the synchronization penalty (due to the implicit barrier in the OpenMP constructs) is small compared to the gain from parallelization, and the barrier in turn helps preserve the cache coherence.

In the end of the next section, we present a strong-scaling measurement for a typical problem size required by the physics (29; 40; 38). It will be shown that the wavefront approach outperforms the swarm approach.

4 Validation

User instruction for the program is given in the README.md file distributed along with the code. We have tested our program on both Linux and Mac OS X. There are at least three possible tests to pass for establishing the validity of this code: 1. in , where the full, exact solution can be obtained; 2. in because we can solve for the first few triangular tiles analytically (29); 3. the two-photon wavefunction , because the steady-state result is known from scattering theory (8).

For the first test, in Fig. 5 we show several snapshots of the absolute and relative errors of the real part of as a function of steps in the -direction (and in all plots). We note that in order to reach the steady state, usually the termination time needs to satisfy . In addition, note two common features in these plots: (a) the error is zero in the initial steps because it’s where the boundary condition resides; (b) errors are well-controlled in the sense that they are bounded in an oscillating envelope, so the program behaves correctly in .

Figure 5: Absolute (left) and relative (right) errors for as a function of spatial steps (in units of ) at (from top to bottom). The last step corresponds to . Input parameters: .
Figure 6: Comparison of calculated two-photon wavefunction (red curves) as a function of photon separation in the long-time limit for and (left) (right) . The blue dots are from the numerically exact scattering theory (8). Input parameters: .
Figure 7: Comparison of calculated two-photon wavefunction (red curves) as a function of photon separation in the long-time limit for and (left) (right) . The blue dots are from the numerically exact scattering theory (8). Input parameters: .
Figure 8: Comparison of calculated two-photon wavefunction (red curves) as a function of photon separation in the long-time limit for and (left) (right) . The blue dots are from the numerically exact scattering theory (8). Input parameters: .

For the second test, we generated the analytical expressions of in the first four tiles (the fifth one took too much time to compute) in Mathematica (29) and then compared the values on the FDTD grid points. The numerical result agreed well too (not shown).

Finally, to compare with the scattering theory we can construct the two-photon wavefunction using the calculated according to the formal solution of , Eq. (2), where the coordinates and should be chosen as for capturing the outgoing fields. In the program, is calculated by setting and , with being the separation of the two detectors.999We note that by definition the two-photon correlation function is given by with () being the position of the first (second) detector. The results are shown in Figs. 6-8. The agreement is quite well, even in the non-Markovian regime in which (Fig. 8). Because evaluating Eq. (2) requires the full history of , it is clear the program is valid for all regions in the spacetime. More results generated by our FDTD program are discussed in Refs. (29; 40; 38).

Figure 9: Log-log plot of measured elapsed time as a function of the number of solvers (threads) set by Nth using the swarm (orange) and the wavefront (yellow) approaches. The test is run 10 times for each case. For single thread, the measurements without threading overhead (light blue circles) are also shown. The straight lines are fitted by all data points with excluded. Input parameters: , , , , init_cond, , , and . Test environment: Intel Xeon CPU E5-2670 2.6 GHz with 16 physical cores, 128 GB physical memory, and gcc 7.2.0.

We next comment briefly on the performance of multi-thread support. In Fig. 9 we report the elapsed time as a function of the number of threads (specified by Nth) for solving Eq. (1) using both the swarm and wavefront approaches for the same input parameters, which are chosen to roughly match those used in Refs. (29; 40; 38). In order to have a fair comparison, note that when single thread is used, the threading libraries would impose unnecessary overhead, which is enormous in particular for the wavefront approach since after every step a barrier synchronization is enforced. Therefore, we also report in Fig. 9 the measurement without using any threading mechanism (light blue circles), compared with which the speedup is close to 2x using the wavefront approach with 16 threads. We find that the wavefront approach performs much better than its swarm counterpart: it has a better scaling and is already faster than single thread with only 8 threads. However, both approaches do provide shorter runtime than that of the serial version. Finally, we note in passing that the test is performed on a machine with hyperthreading turned off, so we are unable to test with more threads, but we expect to see a consistent scaling behavior when running on machines with physical or logical cores.

5 Conclusion

The program provides a numerically exact, time-dependent solution to the problem of a single 2LS placed in front of a mirror scattered by either a one- or two-photon initial state, so for waveguide-QED researchers this program can be very useful for solving the three classes of problems as presented above. Arbitrary wavepackets or other forms of initial conditions can be incorporated into the code with minor modification.

More flexibility, such as a giant atom that has arbitrary number of distant legs coupled to the waveguide (41; 42; 20), can be gained by re-implementing this program in an OO language (C++, Python, etc). In fact, the code has been written in the OO-style by (i) compacting all relevant fields into a C struct (roughly equivalent to a C++/Python class) named grid, (ii) passing “this” pointer to a grid instance when calling functions, as if they were member functions of the grid class, and (iii) following RAII. As a result, rewriting in C++, for example, should be of minimal work. Of course, the grid layout must be carefully designed in such a scenario to minimize memory usage. As for the case of an infinite waveguide, we believe the best design is to use MPI for communication between the left- and right-moving solvers, which we leave for future work.

Finally, this program serves as a proof of concept for mathematicians, scientists and engineers who are interested in solving complicated delay PDE using parallel FDTD. Important issues such as general proof of stability, more efficient multi-thread implementation, robust optimization over any 1+1D delay PDE, etc., remain challenging. For development discussions, please either open an issue on GitHub101010https://github.com/leofang/FDTD or contact the authors. Users and developers who are benefited from this program are strongly encouraged to cite both this paper as well as Ref. (29).

6 Acknowledgments

We acknowledge financial support from U.S. NSF (Grant No. PHY-14-04125) and the Brookhaven National Laboratory’s Laboratory Directed Research and Development project #17-029, and fruitful discussion with Harold U. Baranger, Francesco Ciccarello and Meifeng Lin. We also thank Weiguo Yin for proofreading. Part of the tests used resources of the Center for Functional Nanomaterials, which is a U.S. DOE Office of Science User Facility, at Brookhaven National Laboratory under Contract No. DE-SC0012704.

Appendix A Stability analysis

Following Ref. (36), we can perform a simple von Neumann stability analysis to check that our regularization scheme does not lead to amplitude divergence. First, let us consider the simplest ordinary PDE [compared with Eq. (1), ]. If we write , where is called the amplification factor, then we have


It can be shown that as long as , so our approach for solving this simple PDE is stable.

For Eq. (1) in , we can again insert the ansatz and obtain (assuming for simplicity)


Note that the parameter enters because of the delay term. Now this is a polynomial in of order , so there is no general solution for its roots. But empirically (i.e. by solving the equation graphically) we still find that ; that is, our algorithm is stable in .

A more general stability analysis for Eq. (1) is arduous. However, we hope the above reasoning together with the validations presented above are enough to convince the interested users that our FDTD program is stable.

Appendix B Conversions between physical quantities and simulation parameters

In the scattering problem, the three important parameters are (), () and (). To satisfy the rotating-wave approximation is required, but how large is should not matter. Once they are determined, the physics is determined. Here we describe how to express physical quantities in terms of , , , and :


A Python script is used to prepare the input parameters using the above relations.

We remark the role of , which represents both length and time in dimensional analysis (recall

). Thus, we have a degree of freedom (dof) to choose its value without affecting the physics, as if we were changing the length unit. We emphasize this notion because there is in fact a deeper reason for having this dof: Eq. (

1) is a wave equation, and thus scale invariant (43). It is well-known that FDTD is quite suitable for scale-invariant problems, and the step size is just a conceptual quantity for proper discretization and is irrelevant of the physics. What matters is the dimensionless quantities such as and , not itself.

The current implementation is not yet strictly scale invariant — we still explicitly keep as a mandatory parameter, whose value does not affect the result (providing rounding errors are negligible), but it is not necessary if the code is implemented in a different way. What really controls the (relative) step size, and therefore the error, is the ratio of ; see Eq. (19). As a result, we need to have many steps per wavelength. Empirically we find is desirable.

Appendix C Evaluating incomplete Gamma functions on the complex plane

In the present work, the incomplete Gamma function naturally emerges from our equations. While there are several open-sourced math libraries such as GSL and Boost that implement the real-valued version, evaluating for complex-valued argument is a very tricky task and to our knowledge no open-sourced code provides such a routine. Commercial softwares like Mathematica does provide one for arbitrary arguments, but its efficiency is nonideal for simulation purposes. As a by-product, therefore, we provide an implementation of the incomplete Gamma function for nonzero positive integers , and complex-valued .

Figure 10: Evaluating for complex-valued . Depending on , we use different formulae, labeled from (a) to (d), to achieve fast and accurate convergence. Note that (c) and (d) are employed when and .

Specifically, we implement the normalized lower incomplete Gamma function (37)


Other members in the incomplete-Gamma family can be easily obtained if is known. The most challenging part is when is on the negative real axis. Fortunately, this problem is recently tackled in Ref. (44), and we incorporate their findings into our implementation, which works well even when has a tiny, nonzero imaginary part. Our implementation is summarized in Fig. 10. We evaluate according to four different expressions for in different regions on the complex plane:

  1. Continuous fraction; see Eq. (6.2.7) in Ref. (36):

  2. Series expansion; see Eq. (6.2.5) in Ref. (36):

  3. Series expansion for ; see Eq. (6) in Ref. (44):

  4. Poincaré-type expansion; see Eq. (29) in Ref. (44):


We note that combining (a) and (b) gives the standard approach to real-valued (36), and that for very large the computation may not converge (44), but the convergence range is large enough for our purposes.

Appendix D Non-Markovian measures

In this Appendix we illustrate the computation of the non-Markovian (NM) geometric measure (39) quantifying the single-photon scattering process in this system. Specifically, we consider initially the waveguide has a single-photon exponential wavepacket of the form Eq. (7), and calculate two functions, and ,8 for constructing the geometric measure. The detailed discussion is reported elsewhere (29), and here we simply quote the results. The computation can be performed by setting an arbitrarily positive alpha, init_cond=2, and measure_NM=1 in the input file.

Denoting as the photon wavefunction in the one-excitation sector, we can define a complex function as


Similarly, we define a real function as


where is given by Eq. (13). After calculating , the program will compute , , and [Eq. (9)], and output the results as plain text. The geometric measure is defined as (39)


and elsewhere (29) we show that for our system . Therefore, once the two functions and are known, the geometric measure can be calculated easily.111111 and can also be used to construct other NM measures; detail in progress will be reported elsewhere.

Figure 11: Spacetime layout of the FDTD output files. Information for is not written to the disk in order to reduce the file size. During post processing, one can load data for every 100 steps into memory (blue stripes). The red dashed line represents the light cone extended from the second 2LS at , and the time at which it intersects with the box boundary is denoted .

For users interested in computing the two functions themselves, we note that the spatial integrals in Eqs. (25) and (26) need some care, which is explained in the rest of this section. (In our code they are computed according to the trapezoidal rule.) One of the issues for numerically computing these integrals is that the amount of data generated by the FDTD program can be huge, as previously explained. It is certainly inconvenient to write data into a file and subsequently load them into the memory for post processing.

We suggest a manageable way to reduce the disk and memory usages. First, note that the analytical solution in is known, so do not write data in this region into files. The resulting data layout is shown in Fig. 11. Doing this will cut half of the disk usage (compare Fig. 11 with Fig. 1). Next, depending on the system time scale, one can selectively load data for every, say, 100 steps (blue stripes in Fig. 11; set Tstep=99). This will significantly reduce the memory usage. One can keep loading data into the memory until reaching , the max simulation time in FDTD, but for the purpose of calculating the integrals, one may only use data up to , where


The reason is that the time is where the second light cone intersects with the box boundary. If data beyond this limit is used, the wavefront will go outside of the box and the integrals would be underestimated. Thus, instead of positive infinity, numerically the upper bound for those integrals should be the -coordinate that intersects with at the 2nd light cone. Finally, we note that Tstep

should be wisely chosen so that after interpolation the output data can faithfully represent the result.