Measurement-based adaptation protocol with quantum reinforcement learning
Machine learning employs dynamical algorithms that mimic the human capacity to learn, and among these, reinforcement learning algorithms are the closest to the way humans learn. Adaptability, in turn, is essential for performing any task efficiently in a changing environment, and it is fundamental to processes such as natural selection. Here, we propose an algorithm based on successive measurements to adapt a quantum state to an unknown reference state, in the sense of achieving maximum overlap. The environment naturally provides many identical copies of the reference state, so that each measurement iteration extracts more information about it. In our protocol, we consider a system composed of three parts: the "environment" system, which provides the copies of the reference state; the register, an auxiliary subsystem that interacts with the environment to acquire information from it; and the agent, the quantum state that is adapted via digital feedback conditioned on the outcomes of the measurements on the register. With this proposal we achieve an average fidelity between the environment and the agent of more than 90% in fewer than 30 iterations of the protocol. In addition, we extend the formalism to d-dimensional states, reaching an average fidelity of around 80% in fewer than 400 iterations for d = 11, for a variety of genuinely quantum as well as semiclassical states. This work paves the way for the development of quantum reinforcement learning protocols using quantum data, and for the future deployment of semi-autonomous quantum systems.
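To illustrate the flavor of such a measurement-feedback loop, the following is a minimal single-qubit toy simulation, not the paper's exact protocol: it assumes a simplified rule in which the probability of a "success" outcome equals the agent-environment overlap, and the reward/punishment contraction factors and the random-rotation feedback are hypothetical choices made only for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_state(d=2):
    """Draw a Haar-random pure state of dimension d (the unknown reference)."""
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    return v / np.linalg.norm(v)

def fidelity(a, b):
    """Overlap |<a|b>|^2 between two pure states."""
    return abs(np.vdot(a, b)) ** 2

env = random_state()                          # unknown "environment" reference state
agent = np.array([1.0, 0.0], dtype=complex)   # agent starts in a fixed fiducial state

w = np.pi                    # exploration range of the feedback rotations
reward, punish = 0.8, 1.2    # hypothetical contraction / expansion factors

for it in range(30):
    # Simplified measurement step: a fresh environment copy yields outcome 0
    # ("aligned") with probability equal to the current agent-environment overlap.
    p0 = fidelity(agent, env)
    outcome = 0 if rng.random() < p0 else 1

    if outcome == 0:
        w *= reward                          # success: shrink the exploration range
    else:
        # failure: apply a random rotation to the agent within range w
        theta, phi = w * rng.uniform(-1, 1, size=2)
        c, s = np.cos(theta / 2), np.sin(theta / 2)
        U = np.array([[c, -np.exp(-1j * phi) * s],
                      [np.exp(1j * phi) * s, c]])
        agent = U @ agent
        w = min(w * punish, np.pi)           # and widen the exploration range again

print(f"final fidelity after 30 iterations: {fidelity(agent, env):.3f}")
```

In this sketch the shrinking exploration range plays the role of the reward mechanism described in the abstract: favorable measurement outcomes lock in the current agent state, while unfavorable ones trigger further random corrections.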