Introduction
One of the main consequences of the revolution in computational sciences, started by Alan Turing, Konrad Zuse and John von Neumann, among others [1, 2], is that computers are capable of substituting us and improving our performance in an increasing number of tasks. This is due to the advances in the development of complex algorithms and the technological refinement allowing for faster processing and larger storage. One of the goals in this area, in the frame of bioinspired technologies, is the design of algorithms that provide computers with humanlike capacities such as image and speech recognition, as well as preliminary steps in some aspects related to creativity. These achievements would enable us to interact with computers in a more efficient manner. This research, together with other similar projects, is carried out in the field of artificial intelligence [3]. In particular, researchers in the area of machine learning (ML) inside artificial intelligence are devoted to the design of algorithms responsible for training the machine with data, such that it is able to find a given optimal relation according to specified criteria [4]. More precisely, ML is divided into three main lines depending on the nature of the protocol. In supervised learning, the goal is to teach the machine a known function without explicitly introducing it in its code. In unsupervised learning, the goal is that the machine develops the ability to classify data by grouping it in different subsets depending on its characteristics. In reinforcement learning, the goal is that the machine selects a sequence of actions depending on its interaction with an environment for an optimal transition from the initial to the final state.
The previous ML techniques have also been studied in the quantum regime in a field called quantum machine learning [5, 6, 7, 8, 9, 10, 11, 12] with two main motivations. The first one is to exploit the promised speedup of quantum protocols for improving the already existing classical ones. The second one is to develop unique quantum machine learning protocols for combining them with other quantum computational tasks. Apart from quantum machine learning, fields like quantum neural networks, or the more general quantum artificial intelligence, have also addressed similar problems [13, 14, 15, 16, 17].
Here, we introduce a quantum machine learning algorithm for finding the optimal control state of a multitask controlled unitary operation. It is based on a sequentially applied time-delayed equation that allows one to implement feedback-driven dynamics without the need for intermediate measurements. The purely quantum encoding makes it possible to speed up the training process by evaluating all possible choices in parallel. Finally, we analyze the performance of the algorithm by comparing the ideal solution with the one obtained by the algorithm.
Results
Quantum Machine Learning Algorithm
The first step in the description of the algorithm is the definition of the concept of multitask controlled unitary operations $U$. In essence, these do not differ from ordinary controlled operations, but the multitask label is selected to emphasize that more than two operations are in principle possible. $U$ acts on $|\psi\rangle \in \mathcal{H}_c \otimes \mathcal{H}_t$, a quantum state belonging to the tensor product of the control, $\mathcal{H}_c$, and target, $\mathcal{H}_t$, Hilbert spaces. The dimension of both subspaces is the same, $d$, and depends on the particular problem to address. Mathematically, we define $U$ as
$$U = \sum_{i=1}^{d} |c_i\rangle\langle c_i| \otimes s_i \qquad (1)$$
where $|c_i\rangle$ denotes the control state, and $s_i$ is the reduced or effective unitary operation that $U$ performs on the target subspace when the control is on $|c_i\rangle$.
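As an illustration of the definition above, the following is a minimal numerical sketch, assuming the usual block form of a controlled operation, in which each orthonormal control state projects onto its own target operation. The function name `controlled_unitary` and the one-qubit example are hypothetical and are not taken from the article.

```python
import numpy as np

def controlled_unitary(control_basis, target_ops):
    """Build sum_i |c_i><c_i| (x) s_i.

    control_basis: matrix whose columns are orthonormal control states |c_i>.
    target_ops: list of unitary matrices s_i acting on the target subspace.
    """
    dim_c = control_basis.shape[0]
    dim_t = target_ops[0].shape[0]
    U = np.zeros((dim_c * dim_t, dim_c * dim_t), dtype=complex)
    for c, s in zip(control_basis.T, target_ops):
        projector = np.outer(c, c.conj())  # |c_i><c_i|
        U += np.kron(projector, s)         # control projector (x) target operation
    return U

# Hypothetical example: one control qubit and one target qubit.
identity = np.eye(2, dtype=complex)
pauli_x = np.array([[0, 1], [1, 0]], dtype=complex)
basis = np.eye(2, dtype=complex)  # columns are the two control states
U = controlled_unitary(basis, [identity, pauli_x])  # reproduces a CNOT gate
```

With more than two control states and target operations the same function yields a genuinely multitask operation; the two-operation case is only the smallest instance.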
The goal of our algorithm is to explore the control subspace and find the control state that maximizes the implementation of a known , , which is given in terms of the and states as . Therefore, our algorithm is appropriate when is experimentally implementable but its internal structure, the relation between and , is unknown. In other words, our algorithm enables the training of the control subspace by providing data about the target subspace , in order to achieve that the complete system implements the desired operation in the target subspace . Our inspirations for the model of controlled unitary operations are supervised learning protocols, in which the goal is that the system is able to learn a given known function. Here, the control subspace plays the role of the memory of the system. This control, or memory, is the mechanism by which the system is able to store the information about the operation that it has to implement. The idea of our algorithm is that the user transmits the information of the operation the system has to make. Therefore, the goal is not to perform a given gate, but to save this information in the system.
The protocol consists in sequentially reapplying the same dynamics in such a way that the initial state in the target subspace is always , while the initial state in the control subspace is the output of the previous cycle. The equation modeling the dynamics is
(2) 
In this equation is the Heaviside function, is the Hamiltonian giving rise to with , and is the Hamiltonian connecting the input and output states, with and the coupling constants of each Hamiltonian. We point out that this evolution cannot be realized with ordinary unitary or dissipative techniques. Nevertheless, recent studies of time-delayed equations provide all the ingredients for implementing this kind of process [18, 19, 20, 21]. Pending future experimental analyses of the scalability of the presented examples, the inclusion of time-delayed terms in the evolution equation is a realistic approach in the technological framework provided by current quantum platforms. Another important feature of Eq. (2), related to the delayed term, is that it only acquires physical meaning once the output is normalized.
Regarding the behavior of the equation, each term has a specific role in the learning algorithm. The mechanism is inspired by the most intuitive classical technique for solving this problem, which is the comparison between the input and output states together with the corresponding modification of the control state. Here, the first Hamiltonian produces while the second Hamiltonian produces the reward by populating the control states responsible for the desired modification of the target subspace. The structure of guarantees that only the population in the control associated with the optimal is increased,
(3) 
Notice that while this Hamiltonian does not contain explicit information about , the solution of the problem, its multiplication with the feedback term, , is responsible for introducing the reward as an intrinsic part of the dynamics. This is a convenient approach because it eliminates the measurements required during the training phase. In the present case, where we employ a single pair of {, } target states, is fixed and time independent. However, this could change in a more complex situation with pairs of , target states, such that , where would also be time independent but different in each episode. Although this generalization is not included in this article, it points to a promising direction for enhancing the protocol.
We also remark on the similarity between the effect of the delay term in our quantum evolution and gradient ascent techniques in algorithms for optimization problems [3]. A possible strategy to perform the learning protocol would be to feed the system with random control states, measure each result, and combine them to obtain the final solution. However, we have found that it suffices to initialize the control subspace in a superposition of the elements of the basis. This purely quantum feature significantly reduces the required resources, because a single initial state replaces a set of random states large enough to cover all possible solutions.
Numerical Simulations
We have numerically tested our proposed algorithm in a selection of examples covering the cases with unique or multiple solutions, as well as higher-dimensional systems. We consider as a figure of merit the fidelity function given by the trace of the product between the control state obtained by the algorithm and the ideal control state. In order to recover the solution of the problem we need to trace out the target degrees of freedom, obtaining a density matrix. Therefore, the iteration of the protocol would require solving Eq. (2) written for density matrices. This turns out to be a nontrivial task given the nonlocal cross terms of the generalized master equation, which reads
(4)
To achieve the solution in the most efficient way, we have decomposed each density matrix into a convex sum of pure states and solved the vector equation in Eq. (2) for each of them separately, retrieving the total solution as a convex combination of the individual ones. This method is consistent due to the linearity of Eq. (4).
Definition of the SWAP gate problem
A first specific example we address in this manuscript is given by the excitation transport produced by the controlled SWAP gate. In this scenario, the complete system is an n-node network, where each node is composed of a control and a target qubit. Therefore, the control and target subspaces are defined as and . The excitations in this system belong to the target subspace and are exchanged between two nodes when both nodes are in a particular state of the control subspace. The control states are in a superposition of open and closed, and , while the target qubits are written in the standard basis denoting the absence or presence of excitations. We define , the multitask controlled unitary operation, to implement the SWAP gate between connected nodes only if all the controls of the corresponding nodes are in the open state, . See Fig. 1 for a graphical representation of the simplest cases, the two- and three-node line networks. The explicit formula for is given by
(5)
where represents the SWAP gate between qubits and . Here, the first two qubits represent the control subspace and the last two represent the target subspace. Although we have employed unitary operations for illustration purposes, the equation requires the translation to Hamiltonians. In order to do so, we first select to be and calculate the matrix logarithm, which yields the result for in Eq. (2), . Denoting with the Pauli matrices, for reads
(6) 
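The translation from a unitary operation to a Hamiltonian described above can be sketched numerically. The snippet below is only an illustration under stated assumptions: it uses the convention U = exp(-i H τ) with τ = 1, labels the "all controls open" state as the first computational basis state of the two control qubits, and replaces a general matrix logarithm with an eigendecomposition, which is valid here because the controlled SWAP is both unitary and Hermitian. The helper names are hypothetical, and the evolution time and control basis used in the article are not reproduced.

```python
import numpy as np

SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)

def hamiltonian_from_unitary(U, tau=1.0):
    """Return a Hermitian H with exp(-1j * H * tau) == U, for Hermitian unitary U."""
    eigenvalues, eigenvectors = np.linalg.eigh(U)  # eigenvalues are +1 or -1
    phases = np.angle(eigenvalues)                 # 0 for +1, pi for -1
    return eigenvectors @ np.diag(-phases / tau) @ eigenvectors.conj().T

def evolve(H, tau=1.0):
    """Return exp(-1j * H * tau) through the spectral decomposition of H."""
    energies, vectors = np.linalg.eigh(H)
    return vectors @ np.diag(np.exp(-1j * energies * tau)) @ vectors.conj().T

# Assumed labeling: index 0 of the two control qubits means "both nodes open".
open_open = np.zeros((4, 4), dtype=complex)
open_open[0, 0] = 1.0
# SWAP on the targets if both controls are open, identity otherwise.
U = np.kron(open_open, SWAP) + np.kron(np.eye(4) - open_open, np.eye(4))
H = hamiltonian_from_unitary(U)
```

For unitaries that are not Hermitian, a general matrix logarithm would be needed instead of this shortcut.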
Unique solution of the quantum machine learning algorithm
The first family of problems we address is n-node line networks, in which the nodes are located in a one-dimensional array and are only connected with their nearest neighbors. The goal is to find the control state that allows transmitting an excitation from the first to the last node of the network, which in this case requires that all intermediate connections are active. The pair of is determined by these constraints as and . Accordingly, the problem has a unique solution, given by the control state with all the nodes open, . The parameters we have selected are , , and , where represents the total duration of each episode. In Fig. 2 we plot the results together with the required resources. These examples show that the algorithm works properly for this family of problems independently of the natural basis of . The Hamiltonians employed in the simulations for are given by
(7) 
Multiple solutions of the quantum machine learning algorithm
We now address a set of more complicated networks, which will allow us to clarify how the algorithm performs when solving problems with multiple solutions. These are the A network for three nodes and the B and C networks for four nodes, depicted in Fig. 3. The goal of the algorithm is the same as in the previous case, i.e., to find the control state capable of sending an excitation from the first to the last node. The difference is that these networks accept two pure states and their superpositions as solutions, a feature that is reflected in the result obtained with the algorithm. The asymptotic state achieved under the feedback-induced quantum learning equation is a quantum superposition of both solutions; see Fig. 4a for the numerical simulations. In this case, the previous definition of the fidelity is not valid. Therefore, we provide a new one in terms of the and states of the target space and the Hamiltonian . The new fidelity corresponds to the trace of the product between the ideal output , and the output obtained with the control state achieved by the algorithm after acting on . Both ideal and real outputs belong to the target subspace. While the {, } pair is the same as in the previous case, the Hamiltonians change their definition to
(8) 
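The two figures of merit used in the simulations, the overlap of the reduced control state with an ideal control state and the overlap of the reduced target state with the ideal output, can be sketched as follows. This is a hedged illustration rather than the simulation code of the article: the helper names `reduced_state` and `overlap_fidelity` are hypothetical, and a CNOT gate stands in for the controlled unitary operation.

```python
import numpy as np

def reduced_state(psi, dim_c, dim_t, keep):
    """Partial trace of a pure control (x) target state.

    keep="control" traces out the target; keep="target" traces out the control.
    """
    amplitudes = psi.reshape(dim_c, dim_t)  # rows: control index, columns: target index
    if keep == "control":
        return amplitudes @ amplitudes.conj().T
    return amplitudes.T @ amplitudes.conj()

def overlap_fidelity(rho, ideal):
    """Trace of the product between a density matrix and an ideal pure state."""
    return float(np.real(np.trace(rho @ np.outer(ideal, ideal.conj()))))

# Stand-in controlled operation: flip the target only if the control is |1>.
cnot = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)
zero = np.array([1, 0], dtype=complex)
one = np.array([0, 1], dtype=complex)
plus = (zero + one) / np.sqrt(2)  # equally weighted control state

psi_out = cnot @ np.kron(plus, zero)  # input target |0>, ideal output |1>
control_fidelity = overlap_fidelity(reduced_state(psi_out, 2, 2, "control"), one)
target_fidelity = overlap_fidelity(reduced_state(psi_out, 2, 2, "target"), one)
```

In this toy case both fidelities equal 0.5 for the equally weighted control state and rise to 1 when the control is prepared in the state that triggers the desired target operation.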
For the cases studied, the complete set of solutions is obtained encoded in the outcome of the algorithm. This is convenient because it allows one to design a protocol to select a specific optimal solution according to given criteria. In the networks we are analyzing, one might want to obtain the most efficient solution, defining efficiency as achieving the transmission of the excitation while minimizing the number of open nodes. To accomplish this task, a dissipative term has to be included in the evolution equation to filter out the undesired solutions. We point out that a control-dependent dissipation affects the target subspace, modifying the protocol in the required manner. We explicitly write the Lindblad operators and dissipation constants for a two-node case as follows
(9) 
Instead of solving the master equation, we have employed the quantum jump formalism, which allows one to work with Eq. (2) instead of Eq. (4), with the consequent simplicity. The dissipation can be modeled with an additional term in the first part of the time-delayed equation in the absence of a decay event. Therefore, in order to ensure that the non-Hermitian Hamiltonian accounts for the real evolution of the system, one has to properly balance the relation between and .
(10) 
A non-dissipative alternative consists in modifying the coupling constant associated with each of the control-target pairs in the unitary operation. These two techniques allow us to find the shortest path between two nodes in a network once the natural basis of the unitary is known.
Extension to qudits
Another possible aspect to study is the extension of the algorithm to higher-dimensional building blocks. We provide an example in which the optimal control state for a multitask controlled unitary operation acting on qutrits is obtained. This operation is defined in terms of the control states as
(11) 
where the first qutrit belongs to the control subspace and the second one belongs to the target subspace. Although no network is defined in this case, the goal of the algorithm is to find the control state that realizes the  transition in the target subspace. In this problem, the system consists of a single control qutrit and a single target qutrit. See Fig. 4 for a numerical simulation of the learning process in this particular case.
Extension to phase gates
All examples discussed up to this point consisted of gates whose effect can be understood as a permutation of the basis elements. Let us now consider a different scenario in which the operations in the target subspace are phase gates, , so that the complete unitary operation reads with . If we choose the reference target states to be
(12) 
we know a priori that the only solution is given by associated with control . We perform a numerical experiment to analyze how the initial equally weighted control state, , converges to the solution under the action of depending on the dimension of the system. See Fig. 5 for the simulations. The results show that our algorithm is particularly efficient for this selection of Hamiltonians, given that the solution is reached in for all the cases studied.
Efficiency of the Quantum Machine Learning Algorithm
It is important to mention that the simulations and techniques we provide here constitute an analysis of our quantum machine learning algorithm, but our aim in this work is not to demonstrate scalability or quantum speedup. It would be convenient to solve Eq. (2) analytically in order to rigorously analyze the scope of the algorithm and obtain information about its scalability for general problems. Since we have not solved the dynamics analytically, we evaluate the performance by comparing our results with the ones obtained via different methods. In particular, we follow two different strategies to determine the structure of the controlled unitary operation: measuring it, and analyzing it with machine learning techniques. Here, the resources are quantified by the number of times the unitary operation has to be applied and the output measured in order to determine its structure.
Machine Learning
Here, we employ state-of-the-art classical machine learning algorithms to compare with our quantum protocol. We show the results achieved for three different networks, the two-node line and two different instances of the three-node line, all of them previously studied with our algorithm in Fig. 2. The numerical experiment is designed for determining the optimal control state by evaluating the action of on the tensor product of a random control state and the fixed . The data consists of a collection of random control states, which cover the whole control subspace, with their corresponding fidelity for a fixed pair.
For each network, three data sets were used (small, medium, large) with a different number of instances. It must be emphasized that all results refer to test sets, i.e., they were obtained with data not used to train the models. Therefore, they can be taken as a good estimation of the prediction capability of the models for new unseen data. Cross-validation was implemented by means of a k-fold approach [4], where k = 10 for all data sets, except for the small data set of the two-node line network, for which k = 5 due to the very limited number of instances.
All results were achieved by using Support Vector Regressors (SVRs) [22], whose characteristics make them especially adequate when dealing with sparse data sets (few instances and high dimension). SVRs work by creating a transformed data space in which the problem is more easily solvable (ideally the problem is transformed into a linear one). That transformation between spaces is carried out by the so-called kernels (Gaussian and polynomial kernels have been used in this experimentation). The data used for training the models has been randomly selected from a set of multiple pairs of control state and fidelity. Although other ML approaches, such as Reinforcement Learning (RL), might seem appropriate to solve this problem, note that the goal of the problem is actually a prediction of the efficiency of the solution rather than the optimal sequence of steps that link the input state with the output state, thus not matching the RL paradigm.
Tables 1, 2 and 3 report the results achieved by the SVR in the three analyzed networks. These correspond to the two- and three-node lines analyzed in Fig. 2. In the case of , the topology of networks A and B is the same, the one depicted in Fig. 1, but they are defined in a different control basis. For each case, the state with the best fidelity is shown, together with the Mean Error (ME) and the Root Mean Square Error (RMSE). ME is a measure of bias that represents the difference between the real and the predicted efficiencies, i.e., it indicates whether the model tends to make overestimations (negative values) or underestimations (positive values). On the other hand, RMSE is a well-known robust measure of accuracy.
Table 1
Number of Instances | Small (10) | Medium (75) | Large (500)
ME                  | 0.0029     | 1.3         | 8.6
RMSE                | 0.0493     | 0.0012      | 0.0026
Best Fidelity       | 0.874      | 0.962       | 0.987

Table 2
Number of Instances | Small (50) | Medium (200) | Large (1000)
ME                  | 7.2        | 2.4          | 3.2
RMSE                | 0.0054     | 0.0017       | 0.0039
Best Fidelity       | 0.6840     | 0.8836       | 0.8872

Table 3
Number of Instances | Small (50) | Medium (200) | Large (1000)
ME                  | 9.3        | 7.8          | 9.6
RMSE                | 0.0082     | 0.0018       | 0.0014
Best Fidelity       | 0.9227     | 0.9188       | 0.9709
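For reference, the two error measures reported in the tables can be computed as in the following minimal sketch. It assumes the sign convention stated in the text, in which ME is the mean of the real minus the predicted values, so that overestimation gives a negative ME. The function names and the sample numbers are hypothetical and do not correspond to the data of the article.

```python
import math

def mean_error(real, predicted):
    """Mean of (real - predicted): negative when the model overestimates."""
    return sum(r - p for r, p in zip(real, predicted)) / len(real)

def root_mean_square_error(real, predicted):
    """Square root of the mean squared difference between real and predicted values."""
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(real, predicted)) / len(real))

# Hypothetical fidelities: the model overestimates every instance by 0.1.
real_fidelities = [0.9, 0.8, 0.5]
predicted_fidelities = [1.0, 0.9, 0.6]
me = mean_error(real_fidelities, predicted_fidelities)
rmse = root_mean_square_error(real_fidelities, predicted_fidelities)
```

ME can be close to zero even for an inaccurate model when over- and underestimations cancel, which is why it is reported together with RMSE.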
Measurement of the Unitary Operation
An alternative method for solving the learning task would be to measure the input-output relation of the controlled unitary operation while exploring the control subspace strategically rather than randomly. Let us denote by the natural basis of the control subspace in , and by our guess for this basis in a Hilbert space of dimension . The measurement protocol consists in applying the unitary operation to , projecting this result on and tracing out the target subspace, obtaining for each . In the worst case, this operation has to be repeated for all to guarantee that the populations of the solutions, and not the internal phases, are found. Afterwards, one has to find the appropriate basis as a linear combination of the proposed one . Another approach is to determine each component of the unitary operation and change to a basis in which the unitary is expressed as a direct sum of the operations. This particular strategy highlights the relation between our algorithm and the field of quantum process tomography.
Comparison
In summary, the purely random approach analyzed with ML techniques requires in principle more resources than the quantum feedback algorithm with delayed equation. Nevertheless, the fact that ML techniques are independent of the basis guarantees their success in any possible situation. The comparison is made between the episodes, the number of times that the time-delayed equation has to be repeated, and the instances, the amount of data employed in the ML algorithm. Even if both methods are based on different training mechanisms, the information fed to both of them is the same, a figure of merit for each control state. In the SVR the system is provided with pairs of control state and its corresponding fidelity, which requires the implicit knowledge of and the ideal operation. The connection with the quantum algorithm is that the delay term in Eq. (2) provides a distance that works in a way analogous to the fidelity in the SVR. Notice that in the quantum algorithm each episode only requires a pair of {, } states, therefore the number of episodes equals the number of instances. A more realistic analysis would take into account the duration of each process, but for the moment we cannot make a precise estimation of the time required to implement a time-delayed equation.
With respect to the complete measurement approach, recent studies bound its scalability in the order of or even , being the latter the dimension of the Hilbert space [23, 24, 25]. On the other hand, the measurement protocol does not provide the solution in a physical register, but it is the analysis of the unitary operation that provides the knowledge of it. Moreover, each implementation of the controlled unitary operation is associated with a measurement, while in the quantum machine learning algorithm intermediate measurements are not required, because they are included as an intrinsic part of the dynamics, in contrast to the tomography approach. Additionally, when measuring, one needs to perform a search for the convenient basis along the Hilbert space to retrieve the correct structure of .
Regarding the scalability of our algorithm, we have observed that the number of episodes for reaching the solution depends on the distance between the initial control state and the solution. A direct consequence is that the protocol will not work properly when the initial control state is orthogonal to the solution. This is important to consider because the way to notice the failure is to validate the result by measuring the outcome of the unitary operation. In the simulations carried out here, we have employed as the initial control state, but this choice is not unique. In some sense, our protocol can also be understood as a search algorithm. Therefore, a comparison with Grover's result [26] may be in order. Regarding the similarities, the conditional phase rotation in Grover's search algorithm requires the use of an oracle, whose role is played in our formalism by the combination of a controlled unitary operation and the time-delayed terms. On the other hand, the main difference between both protocols is that in Grover's algorithm the basis in which the states to optimize are described is known, while in ours the search is performed without previous knowledge of the basis, in a similar spirit to the analog algorithm by Farhi and Gutmann [27]. A positive property of our protocol, in contrast with the previously mentioned quantum search algorithms, is that the solution is reached asymptotically, i.e., the fidelity always increases with the number of episodes.
Discussion
In conclusion, we have proposed a quantum machine learning algorithm in which the implementation of time-delayed dynamics allows one to avoid intermediate measurements, and therefore provides a complementary strategy to conventional quantum machine learning algorithms [28, 29, 30, 31]. Moreover, we have shown how the framework of multitask controlled unitary operations is flexible enough to address different problems such as efficient excitation transport in networks. This kind of protocol may be straightforwardly adapted to different quantum architectures, which is beyond the scope of this article. We believe our study represents the first proposal for exploiting feedback-induced effects of delayed-equation dynamics without intermediate measurements in quantum machine learning algorithms.
References
[1] Turing, A. On computable numbers, with an application to the Entscheidungsproblem. Proc. London Math. Soc. 42, 230 (1936).
[2] Sipser, M. Introduction to the Theory of Computation (Cengage Learning, 2012).
[3] Russell, S. & Norvig, P. Artificial Intelligence (Pearson, 2010).
[4] Alpaydin, E. Introduction to Machine Learning (MIT, Cambridge, 2014).
[5] Paparo, G. D., Dunjko, V., Makmal, A., Martin-Delgado, M. A. & Briegel, H. J. Quantum speedup for active learning agents. Phys. Rev. X 4, 031002 (2014).
[6] Adcock, J. C. et al. Advances in quantum machine learning. arXiv 1512.02900 (2015).
[7] Schuld, M., Sinayskiy, I. & Petruccione, F. An introduction to quantum machine learning. Contemp. Phys. 56, 172 (2015).
[8] Biamonte, J. et al. Quantum Machine Learning. Nature 549, 195 (2017).
[9] Wittek, P. Quantum Machine Learning (Academic Press, San Diego, 2014).
[10] Rebentrost, P., Mohseni, M. & Lloyd, S. Quantum Support Vector Machine for Big Data Classification. Phys. Rev. Lett. 113, 130503 (2014).
[11] Dunjko, V., Taylor, J. M. & Briegel, H. J. Quantum-Enhanced Machine Learning. Phys. Rev. Lett. 117, 130501 (2016).
[12] Sentís, G., Calsamiglia, J., Muñoz-Tapia, R. & Bagan, E. Quantum learning without quantum memory. Sci. Rep. 2, 708 (2012).
[13] Ezhov, A. A. & Ventura, D. Quantum Neural Networks, in Future directions for intelligent systems and information sciences (Physica-Verlag HD, 2000).
[14] Gupta, S. & Zia, R. K. P. Quantum Neural Networks. J. Comput. System Sci. 63, 355 (2001).
[15] Altaisky, M. V. Quantum neural network. arXiv quant-ph/0107012 (2001).
[16] Schuld, M., Sinayskiy, I. & Petruccione, F. The quest for a Quantum Neural Network. Quantum Inf. Process. 13, 2567 (2014).
[17] Wiebe, N., Kapoor, A. & Svore, K. M. Quantum Perceptron Models. arXiv 1602.04799 (2016).
[18] Grimsmo, A. L. Time-Delayed Quantum Feedback Control. Phys. Rev. Lett. 115, 060402 (2015).
[19] Whalen, S. Open Quantum Systems with Time-Delayed Interactions. PhD Thesis, University of Auckland (2015).
[20] Pichler, H. & Zoller, P. Photonic Circuits with Time Delays and Quantum Feedback. Phys. Rev. Lett. 116, 093601 (2016).
[21] Alvarez-Rodriguez, U. et al. Advanced-Retarded Differential Equations in Quantum Photonic Systems. Sci. Rep. 7, 42933 (2017).
[22] Schölkopf, B. & Smola, A. J. Learning with Kernels: Support Vector Machines, Regularization, Optimization and Beyond (MIT Press, Cambridge, 2002).
[23] Gutoski, G. & Johnston, N. Process tomography for unitary quantum channels. J. Math. Phys. 55, 032201 (2014).
[24] Baldwin, C. H., Kalev, A. & Deutsch, I. H. Quantum process tomography of unitary and near-unitary maps. Phys. Rev. A 90, 012110 (2014).
[25] Holzäpfel, M., Baumgratz, T., Cramer, M. & Plenio, M. B. Scalable reconstruction of unitary processes and Hamiltonians. Phys. Rev. A 91, 042129 (2015).
[26] Grover, L. K. A fast quantum mechanical algorithm for database search. Proceedings, 28th Annual ACM Symposium on the Theory of Computing (STOC), 212 (1996).
[27] Farhi, E. & Gutmann, S. Analog analogue of a digital quantum computation. Phys. Rev. A 57, 2403 (1998).
[28] Lamata, L., Alvarez-Rodriguez, U., Martin-Guerrero, J. D., Sanz, M. & Solano, E. Quantum Autoencoders via Quantum Adders with Genetic Algorithms. arXiv 1709.07409 (2017).
[29] Lamata, L. Basic protocols in quantum reinforcement learning with superconducting circuits. Sci. Rep. 7, 1609 (2017).
[30] Benedetti, M., Realpe-Gómez, J., Biswas, R. & Perdomo-Ortiz, A. Estimation of effective temperatures in quantum annealers for sampling applications: A case study with possible applications in deep learning. Phys. Rev. A 94, 022308 (2016).
[31] Dunjko, V. & Briegel, H. J. Machine learning & artificial intelligence in the quantum domain. arXiv 1709.02779 (2017).
Acknowledgements
The authors acknowledge support from Basque Government grants BFI-2012-322 and IT986-16, Spanish MINECO/FEDER FIS2015-69983-P, Ramón y Cajal Grant RYC-2012-11391, and UPV/EHU UFI 11/55.
Author Contribution
U. Alvarez-Rodriguez, L. Lamata and E. Solano designed the time-delayed equation, while P. Escandell-Montero and J. D. Martín-Guerrero analyzed the problem with SVR techniques.
Additional information
The authors declare no competing financial interests.