Defining Quantum Neural Networks via Quantum Time Evolution

05/27/2019 ∙ by Aditya Dendukuri, et al.

This work presents a novel fundamental algorithm for defining and training Neural Networks in Quantum Information based on time evolution and the Hamiltonian. Classical Artificial Neural Network (ANN) algorithms are computationally expensive. For example, in image classification, representing an image pixel by pixel using classical information requires an enormous amount of memory. Hence, exploring methods to represent images in a different paradigm of information is important. Quantum Neural Networks (QNNs) have been explored for over 20 years. The current forefront work based on Variational Quantum Circuits is specifically defined for the Continuous Variable (CV) model of quantum computers. In this work, a model is proposed which is defined at a more fundamental level and hence can be inherited by any variant of quantum computing model. This work also presents a quantum backpropagation algorithm to train our QNN model and validates this algorithm on the MNIST dataset in a quantum computer simulation.




1 Introduction

Deep Learning is a highly successful field of computer science, playing an integral role in cutting-edge technology like pattern recognition and self-driving cars. With recent developments in deep learning, fields like Computer Vision have been revolutionized with significant advancements in pattern recognition, classification, etc. In this work, we focus on the relevance of the quantum framework of information, namely Quantum Computing, to image classification using deep learning. Image classification is currently one of the most rapidly changing research areas in deep learning. Representing an image pixel by pixel using classical information requires enormous amounts of computational resources. Hence, exploring methods to represent images in a different paradigm is important. Deep Neural Networks in Quantum Information, or Quantum Neural Networks (QNNs), have received a lot of attention in the past couple of years due to recent advancements in quantum computing. One of the most popular QNN methods is the Variational Quantum Circuit. Variational Quantum Circuits, however, have a disadvantage: the frontier models of variational quantum circuits are limited to the Continuous Variable paradigm of quantum computers. Hence, we need to build quantum machine learning theory around more fundamental concepts of quantum theory which can apply universally to any quantum computing paradigm. In this work, one of the main focal points is to develop the notion of Variational Quantum Circuits in terms of fundamental concepts in quantum theory, specifically Hamiltonian operators and time evolution of quantum states. Our second focus is to test experimentally the performance of variational quantum circuits on the MNIST handwritten digits database. Our contributions are summarized as follows.

Contributions of this Work. This work defines a Quantum Neural Network (QNN) model as a Hamiltonian dictating the evolution of quantum states. It also derives a theoretical expression for the quantum squared loss and develops a Quantum Backpropagation algorithm based on this definition. It then demonstrates how this fundamental definition can translate to a specific quantum system by experimentally testing our QNN model, training it to identify handwritten digits from the MNIST database in the quantum computer simulation library QuEST [1]. The proposed method obtains 64% accuracy on MNIST with a Quantum Neural Network, which is the highest accuracy for a large-scale dataset to date.

1.1 Quantum Computation

Quantum Computing defines a non-deterministic approach to represent classical information using ideas from Quantum Theory. The idea of quantum information was first introduced in 1980 by Paul Benioff [2]. In the same year, Yuri Manin proposed a quantum computer in his textbook “Computable and Uncomputable” [3]. In 1982, the field was formalized and made popular by Richard Feynman in his paper about simulating physics with computers [4]. David Deutsch further advanced the field by formulating a Quantum Turing Machine [5]. Since then, a number of algorithms have been developed, such as Grover’s search algorithm [6] and Shor’s factoring algorithm [7]. Shor’s factoring algorithm is particularly significant since it demonstrated an exponential speedup over the best known classical algorithms for factoring large numbers. Since modern encryption techniques rely on huge numbers, the ideas sprouting from Quantum Information already have an effect on our world. The core idea of quantum computing is the qubit: the quantum analogue of a classical computer bit. A classical bit is capable of storing a determined value (0 or 1). A qubit (say $|\psi\rangle$) can be represented using a superposition of both 0 and 1,

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle \qquad (1)$$

where $|0\rangle$ and $|1\rangle$ are the vectors $[1, 0]^T$ and $[0, 1]^T$ respectively, and $\alpha$ and $\beta$ are the probability amplitudes. These probability amplitudes are represented using complex numbers. Hence, getting a real-valued probability means taking the modulus squared as follows,

$$P(0) = |\alpha|^2, \qquad P(1) = |\beta|^2$$

The probabilities are normalized, i.e. they add up to 1: $|\alpha|^2 + |\beta|^2 = 1$. This probabilistic model potentially allows us to define algorithms which are intrinsically parallel.
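The relation between amplitudes and probabilities can be checked numerically. The following is a minimal sketch in plain Python; the function name `probabilities` and the example state are illustrative assumptions, not part of the original work.

```python
import math

def probabilities(alpha, beta):
    """Return (P(0), P(1)) for the state alpha|0> + beta|1>."""
    p0 = abs(alpha) ** 2
    p1 = abs(beta) ** 2
    # A valid qubit state must satisfy |alpha|^2 + |beta|^2 = 1.
    if not math.isclose(p0 + p1, 1.0, rel_tol=1e-9):
        raise ValueError("state is not normalized")
    return p0, p1

# Example (assumed for illustration): an equal superposition
# with a relative phase of i on |1>.
alpha = 1 / math.sqrt(2)
beta = 1j / math.sqrt(2)
p0, p1 = probabilities(alpha, beta)
```

Note that the phase of `beta` does not change the measurement probabilities in the computational basis; only the moduli of the amplitudes matter here.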

Figure 1: Visualization of rotation and measurement of a quantum state using the Bloch sphere. $\theta$ and $\phi$ are the parameters in this case. Tuning them rotates the state around the sphere, and measuring in the z direction means measuring in the computational basis ($\{|0\rangle, |1\rangle\}$).

1.1.1 The Bloch Sphere and Quantum Operations

Given Eqn. (1), we can expand out the state of a qubit ($|\psi\rangle$) into a vector as follows:

$$|\psi\rangle = \alpha \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \beta \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}$$

The Quantum Logic Gates that can be applied to qubits come in the form of unitary matrices. Some of these gates are shown in Table 1. The function of quantum gates can be visualized on the Bloch sphere (as shown in Figure 1). The fundamental Pauli spin matrices ($\sigma_x, \sigma_y, \sigma_z$), together with the identity, form a complete basis for single-qubit operators. Using this basis we can form any unitary quantum gate (in the form $e^{-iHt}$), where $H$ is the Hamiltonian and $t$ is the evolution time. Multi-qubit quantum gates like the Controlled NOT (or CNOT) gate, which flips the target qubit if the control qubit is 1, can be decomposed into tensor products of this basis.

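| Name | Dirac Notation | Classical Notation | Function |
|---|---|---|---|
| PauliX | $\lvert 0\rangle\langle 1\rvert + \lvert 1\rangle\langle 0\rvert$ | $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ | Rotate in X direction by $\pi$ |
| PauliY | $-i\lvert 0\rangle\langle 1\rvert + i\lvert 1\rangle\langle 0\rvert$ | $\begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}$ | Rotate in Y direction by $\pi$ |
| PauliZ | $\lvert 0\rangle\langle 0\rvert - \lvert 1\rangle\langle 1\rvert$ | $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$ | Rotate in Z direction by $\pi$ |
| Hadamard | $\frac{1}{\sqrt{2}}(\lvert 0\rangle\langle 0\rvert + \lvert 0\rangle\langle 1\rvert + \lvert 1\rangle\langle 0\rvert - \lvert 1\rangle\langle 1\rvert)$ | $\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$ | Puts the state in superposition |

Table 1: Examples and Demonstration of Various Quantum Logic Gates.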
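To make the action of these gates concrete, the sketch below applies them to the state $|0\rangle$ and checks unitarity. It uses plain nested lists rather than any quantum library; the helper names (`apply`, `dagger`, `matmul`, `is_unitary`) are assumptions introduced for illustration only.

```python
import math

# Standard 2x2 gate matrices, written as nested lists.
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]

def apply(gate, state):
    """Multiply a 2x2 gate by a length-2 state vector."""
    return [sum(gate[r][c] * state[c] for c in range(2)) for r in range(2)]

def dagger(gate):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(gate[c][r]).conjugate() for c in range(2)] for r in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def is_unitary(gate, tol=1e-12):
    """Check that gate^dagger * gate equals the identity."""
    product = matmul(dagger(gate), gate)
    identity = [[1, 0], [0, 1]]
    return all(abs(product[r][c] - identity[r][c]) < tol
               for r in range(2) for c in range(2))

zero = [1, 0]
flipped = apply(X, zero)      # X maps |0> to |1>
superposed = apply(H, zero)   # H maps |0> to (|0> + |1>)/sqrt(2)
```

Here `flipped` is the vector for $|1\rangle$, and `superposed` has two equal amplitudes, which matches the "Function" column of Table 1.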

2 Neural Networks and Deep Learning

Classical neural networks are k-partite graphs which represent non-linear transformations. The nodes of the graph may or may not be fully connected. The first “layer” represents the input to the network as the nodes. To propagate through the network, we transform the input based on the bond strengths between the nodes (weights). Let the input to the network be represented as $x$, and let the transformation matrix $W$ hold the weights of the connections. The transformation for a layer can be modelled as:

$$z = Wx + b \qquad (2)$$

The bias vector ($b$) acts as the intercept of the linear model. The linear model is then passed into a non-linear logistic function $\sigma$ to normalize the output. If there are multiple layers, this output is the input to the next transformation, and hence a neural network with $k$ layers can be represented as:

$$\hat{y} = \sigma(W_k \, \sigma(W_{k-1} \cdots \sigma(W_1 x + b_1) \cdots + b_{k-1}) + b_k) \qquad (3)$$

We can use this function to model a variety of regression and classification problems. The weights and the biases act as tunable parameters which can be adjusted to compute the desired result. This fine-tuning process is termed “training” the neural network. The training is carried out by an algorithm called backpropagation. The backpropagation algorithm minimizes a “loss” function which measures the error of the network. Mean Squared Error is a common loss function, defined over $m$ samples with targets $y_i$ as follows:

$$L = \frac{1}{m} \sum_{i=1}^{m} (\hat{y}_i - y_i)^2 \qquad (4)$$

The training is carried out by updating the weights and biases, with learning rate $\eta$, such that Eqn. (4) is minimized as follows:

$$W \leftarrow W - \eta \frac{\partial L}{\partial W}, \qquad b \leftarrow b - \eta \frac{\partial L}{\partial b} \qquad (5)$$

The partial derivatives are calculated via the chain rule, since the network is a sequence of composed functions (layers). In this way, every transformation in the model, such as the linear and the logistic non-linear transformation of every layer, is fine-tuned to generate a desired output.
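The forward pass, the mean squared error, and the gradient-descent update can be illustrated with a very small example. The sketch below trains a single logistic unit on a toy logical-OR dataset; the dataset, the learning rate, and all function names are assumptions chosen for illustration and do not come from the original experiments.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, b, x):
    """One layer with a single output: sigma(w . x + b)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def mse(w, b, data):
    """Mean squared error over (input, target) pairs."""
    return sum((forward(w, b, x) - y) ** 2 for x, y in data) / len(data)

def train_step(w, b, data, lr):
    """One gradient-descent update, with gradients from the chain rule."""
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for x, y in data:
        out = forward(w, b, x)
        # dL/dz = 2 (out - y) * sigma'(z) / m, with sigma'(z) = out (1 - out)
        dz = 2.0 * (out - y) * out * (1.0 - out) / len(data)
        grad_w = [g + dz * xi for g, xi in zip(grad_w, x)]
        grad_b += dz
    new_w = [wi - lr * g for wi, g in zip(w, grad_w)]
    return new_w, b - lr * grad_b

# Toy data (assumed): learn logical OR.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w, b = [0.0, 0.0], 0.0
loss_before = mse(w, b, data)
for _ in range(2000):
    w, b = train_step(w, b, data, lr=1.0)
loss_after = mse(w, b, data)
```

With all parameters initialized to zero, every output is 0.5, so the initial loss is 0.25; the loss decreases as the updates proceed.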

3 Neural Networks in Quantum Information

Figure 2: Differences between a classical neural network and a variational quantum circuit. Since the input is encoded in a quantum register, there is an exponential reduction in the size of the input space. The quantum circuit begins with the image-encoding gate. In addition, the rotation gates are computed naturally in a quantum system.

Remodelling the theory behind neural networks in quantum information is gaining popularity due to many advancements in quantum computers. The earliest ideas for modelling a neural network in quantum computing trace back to a 1995 paper by Kak [8]. In the same year, Menneer and Narayanan proposed a neural network inspired by quantum processes [9]. Perus [10] suggested the advantage of applying quantum parallelism to a neural network architecture. The first comprehensive study of a quantum neural network model was conducted by Menneer [11]. Altaisky introduced a quantum perceptron model in [12], but noted that the learning rule for this perceptron did not preserve unitarity in general. Gupta and Zia derived a quantum neural network model from Deutsch’s model of a quantum computational network [13]. A Quantum Neural Network based on qubits was introduced by Kouda [14]. Schuld et al. proposed a set of guidelines for developing quantum neural network models [15]. Specifically, a proposal for a generalized quantum neural network should (1) be able to encode some binary string of length N and produce as output some binary string of length M which is closest to the input by some distance measure, (2) reflect one or more basic neural computing mechanisms, and (3) be based on quantum effects such as superposition and interference while remaining fully consistent with quantum theory. Quantum machine learning was experimentally tested for two-class classification using a quantum support vector machine [16]. Wiebe et al. introduced the term quantum deep learning in their paper [17]. Adachi proposed a quantum neural net model which applies Quantum Annealing [18]. A Quantum Boltzmann Machine model was introduced by Amin et al. A quantum recurrent neural network modelled after the Ising Spin Model was proposed in [19]. Another class of quantum neural networks, called variational quantum circuits, with tunable parameterized unitary gates implemented in the Continuous Variable model, was introduced by Killoran et al. in [20]. A Quantum Generative Adversarial Network based on the Variational Circuit Model was introduced by Seth Lloyd in [21].

One obvious difficulty in transplanting quantum computing into machine learning is the general requirement for nonlinear activation functions. A quantum register stores a state vector, which contains the probability amplitudes associated with each possible state. This vector is subject to the normalization condition, which means that any operator applied to the system must be unitary. A unitary operator is a square matrix with the property U*U = UU* = I, where U* is the conjugate transpose of U. Thus, the actions on a quantum register are constrained by linear dynamics, which makes quantum computing fundamentally incompatible with the activation function paradigm. However, the time evolution of quantum states is itself intrinsically non-linear in the Hamiltonian, which enters through a matrix exponential (as shown by the time-evolution equation in Section 3.3).


3.1 Variational Quantum Circuit

There are a number of QNN models proposed based on parameterized quantum circuits, of the kind also used in Variational Quantum Eigensolvers. The circuits are composed of parameterized unitary gates that are optimized to produce the desired wave-function. The output of this model can be interpreted as a probability distribution, similar to the output of a softmax function. This is done by running and measuring the circuit multiple times and calculating the probability distribution of the basis states. To measure the circuit, observables such as the Pauli-Z spin matrix ($\sigma_z$) can be used to measure in the computational basis (as shown in Figure 1). The state of any qubit can be visualized on a Bloch sphere. This kind of circuit with tunable parameters is called a Variational Quantum Circuit (as shown in Figure 2). Variational Quantum Circuits belong to a much larger family of hybrid algorithms which require both classical and quantum components [22].
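The idea of recovering a probability distribution from repeated measurements can be sketched classically. The example below samples computational-basis outcomes from a known state vector and forms a normalized histogram; the function name `sample_measurements`, the fixed seed, and the two-qubit example state are assumptions for illustration, and the sampling stands in for repeated runs on real hardware.

```python
import random
from collections import Counter

def sample_measurements(amplitudes, shots, seed=0):
    """Simulate `shots` computational-basis measurements of a state vector
    and return the estimated probability of each basis state."""
    probs = [abs(a) ** 2 for a in amplitudes]
    rng = random.Random(seed)
    outcomes = rng.choices(range(len(amplitudes)), weights=probs, k=shots)
    counts = Counter(outcomes)
    return [counts[i] / shots for i in range(len(amplitudes))]

# Assumed two-qubit state with probabilities 0.1, 0.2, 0.3, 0.4.
state = [0.1 ** 0.5, 0.2 ** 0.5, 0.3 ** 0.5, 0.4 ** 0.5]
estimate = sample_measurements(state, shots=10_000)
```

The estimate approaches the true distribution as the number of shots grows, with a statistical error that shrinks roughly as one over the square root of the shot count.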

3.1.1 Advantages and Disadvantages of the Variational Quantum Circuit Model

There are a number of potential advantages of Variational Quantum Circuits over classical deep models, as follows:

  • The first motivation is the notion of quantum parallelism. Since a state in superposition can be modeled as holding both possibilities, an operation on it could potentially compute on both possibilities at once.

  • The natural ability of quantum gates to represent a rotation operator. There exist unitary quantum gates which represent rotations around the Bloch sphere.

  • The superposition between ‘0’ and ‘1’ makes it possible to encode N bits of information in $\log_2 N$ qubits.

  • The reduction in feature space also means a reduction in the number of layers and nodes needed.

However, these advantages come with caveats. Even though the notion of quantum parallelism is theoretically sound, the “quantum advantage” by which a quantum algorithm outperforms a classical one has not yet been experimentally demonstrated. This is because it is extremely difficult to maintain quantum phenomena in real physical systems [23]. Present quantum systems are highly susceptible to external noise, which destroys the superposition. This phenomenon is called decoherence. This issue is usually addressed by a technique called error correction, though decoherence must be suppressed below a certain threshold to achieve fault-tolerant operation, and the physical resource demands of current error-correcting codes are significant.

3.1.2 Encoding Input in a Quantum Register

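The state of $n$ qubits can be mathematically represented in a $2^n$-dimensional Hilbert space. We encode our input image in this Hilbert space. Let us consider $N$ bits of classical input ($x_0, x_1, \ldots, x_{N-1}$) to be encoded in a quantum state with $\log_2 N$ qubits. Encoding the input in the coefficients of the basis states is shown below:

$$|\psi\rangle = \frac{1}{\lVert x \rVert} \sum_{i=0}^{N-1} x_i |i\rangle$$

where $|i\rangle$ is the $i$-th computational basis state and the factor $1/\lVert x \rVert$ normalizes the state.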
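Encoding classical values in the coefficients of basis states can be sketched as follows. This is a classical illustration only: the function name `amplitude_encode`, the zero-padding to the next power of two, and the normalization by the Euclidean norm are assumptions, and preparing such a state on hardware is a separate problem that is not addressed here.

```python
import math

def amplitude_encode(values):
    """Map N classical values to the amplitudes of a state on
    ceil(log2(N)) qubits, padding with zeros and normalizing."""
    # (N - 1).bit_length() equals ceil(log2(N)) for N >= 2.
    n_qubits = max(1, (len(values) - 1).bit_length())
    padded = list(values) + [0.0] * (2 ** n_qubits - len(values))
    norm = math.sqrt(sum(v * v for v in padded))
    if norm == 0:
        raise ValueError("cannot encode an all-zero input")
    return n_qubits, [v / norm for v in padded]

# Eight classical values fit in the amplitudes of three qubits.
n_qubits, state = amplitude_encode([3.0, 0.0, 4.0, 0.0, 0.0, 0.0, 0.0, 0.0])
```

For the example input, the norm is 5, so the non-zero amplitudes become 0.6 and 0.8 and the squared amplitudes sum to 1.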

3.2 Quantum Neural Network based on Time Dependent Schrödinger Equation

Although the Variational Quantum Circuit model fairly mimics a quantum analogue to a classical neural network, the two are not theoretically congruent. In other words, a fully connected neural network with multiple layers cannot be exactly converted to a Variational Quantum Circuit. Another important consideration is to exploit uniquely quantum properties like entanglement, since most other quantum operations can be simulated on a classical system using supercomputers. Hence, we propose a quantum neural network model based on these considerations. Our main motivation behind this quantum neural network model is to utilize quantum properties to the fullest.

3.3 The Proposed Model

Consider a quantum register with an initial state $|\psi(0)\rangle$ in which an input is encoded. The time evolution of any quantum state can be defined as follows,

$$|\psi(t)\rangle = e^{-iHt} |\psi(0)\rangle \qquad (6)$$

where $H$ is called the Hamiltonian of the system, which dictates the evolution of the system. Hence, we define the evolution structure of the Quantum Neural Network in the Hamiltonian basis:

$$H = \sum_{j} w_j \sigma_j \qquad (7)$$

where the $w_j$ are real numbers defined as trainable weights, and $\sigma_j$ denotes any fundamental quantum gate, like the PauliX and Identity gates. We denote the operation in Eqn. (6) as $U(t) = e^{-iHt}$. If we insert this Hamiltonian into Eqn. (6), we get,

$$|\psi(t)\rangle = e^{-it \sum_{j} w_j \sigma_j} |\psi(0)\rangle \qquad (8)$$

There is no requirement to introduce any logistic non-linear function, since this system is intrinsically non-linear in the weights. Eqn. (8) is the theoretical model of our proposed quantum neural network. We can translate any of these matrix exponentials to physical quantum gate matrices (Table 1) [24]. Consider the matrix exponential shown below for a one-qubit case with the PauliX operator $\sigma_x$ and weight $w$:

$$e^{-iw\sigma_x} = \cos(w) I - i \sin(w) \sigma_x = \begin{bmatrix} \cos w & -i\sin w \\ -i \sin w & \cos w \end{bmatrix}$$

Hence, we can construct parameterized quantum gates for any variational quantum circuit with this definition. This can be used to model a rotation in any direction, as long as the matrices in $H$ are Hermitian, which guarantees that the resulting gate is unitary.
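The translation from a matrix exponential to a rotation gate can be checked numerically for the one-qubit PauliX case. The sketch below compares a truncated Taylor series of $e^{-iwX}$ with the closed form $\cos(w) I - i \sin(w) X$. The sign convention in the exponent, the weight value 0.7, and the helper names are assumptions made for this illustration.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def expm_taylor(m, terms=30):
    """Approximate exp(m) for a 2x2 matrix with a truncated Taylor series."""
    result = [[1, 0], [0, 1]]
    term = [[1, 0], [0, 1]]
    for k in range(1, terms):
        # After this step, term holds m^k / k!.
        term = matmul(term, m)
        term = [[entry / k for entry in row] for row in term]
        result = [[result[r][c] + term[r][c] for c in range(2)]
                  for r in range(2)]
    return result

def rx_closed_form(w):
    """cos(w) I - i sin(w) X, the closed form of exp(-i w X)."""
    return [[math.cos(w), -1j * math.sin(w)],
            [-1j * math.sin(w), math.cos(w)]]

w = 0.7  # assumed example weight
X = [[0, 1], [1, 0]]
generator = [[-1j * w * entry for entry in row] for row in X]
series = expm_taylor(generator)
closed = rx_closed_form(w)
max_error = max(abs(series[r][c] - closed[r][c])
                for r in range(2) for c in range(2))
```

The closed form holds because $X^2 = I$, so the even powers of the series collect into $\cos(w)$ and the odd powers into $\sin(w)$.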

3.3.1 Defining Loss and Training

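To optimize our QNN, we need to define a loss function to measure the performance of the network. We first need to encode the labels from the dataset in quantum states (say $|y\rangle$). Since we are working with complex numbers, squaring a complex number means multiplying it by its conjugate. If the dataset size is $m$, we define the cost function, with $\psi_i$ and $y_i$ denoting the output and label amplitudes for the $i$-th sample, as shown below:

$$C = \frac{1}{m} \sum_{i=1}^{m} (\psi_i - y_i)(\psi_i^{*} - y_i^{*}) \qquad (9)$$

where $\psi_i^{*}$ and $y_i^{*}$ refer to the complex conjugates of $\psi_i$ and $y_i$ respectively. To optimize the neural network we need to minimize this cost function. This can be done by computing the gradient of the cost with respect to the weights and updating the weights, with learning rate $\eta$, as given below.

$$w_j \leftarrow w_j - \eta \frac{\partial C}{\partial w_j}$$

We would like to compute the partial derivative with respect to each weight. This process is shown below:

$$\frac{\partial C}{\partial w_j} = \frac{1}{m} \sum_{i=1}^{m} \left[ \frac{\partial \psi_i}{\partial w_j} (\psi_i^{*} - y_i^{*}) + (\psi_i - y_i) \frac{\partial \psi_i^{*}}{\partial w_j} \right] \qquad (10)$$

The derivatives in Eqn. (10) can also be computed numerically using the finite difference method, as shown in [20].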
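The finite-difference route to the gradient can be illustrated on the smallest possible model. The sketch below assumes a one-qubit network with a single weight and a PauliY generator, so that the output state is $e^{-iwY}|0\rangle = [\cos w, \sin w]$, and a label state $|1\rangle$. The model, the step size, and the learning rate are illustrative assumptions and not the configuration used in the experiments.

```python
import math

def qnn_output(w):
    """State exp(-i w Y)|0> = [cos w, sin w] for a one-qubit, one-weight model."""
    return [complex(math.cos(w)), complex(math.sin(w))]

def cost(w, labels):
    """Squared loss: mean over samples of sum_k (psi_k - y_k)(psi_k - y_k)*."""
    psi = qnn_output(w)
    total = 0.0
    for y in labels:
        total += sum(((p - t) * (p - t).conjugate()).real
                     for p, t in zip(psi, y))
    return total / len(labels)

def finite_difference(w, labels, eps=1e-6):
    """Central-difference estimate of dC/dw."""
    return (cost(w + eps, labels) - cost(w - eps, labels)) / (2 * eps)

labels = [[0j, 1 + 0j]]   # assumed target state |1>
initial_gradient = finite_difference(0.1, labels)
w, lr = 0.1, 0.2
for _ in range(200):
    w -= lr * finite_difference(w, labels)
final_cost = cost(w, labels)
```

For this toy model the cost reduces to $2 - 2\sin w$, so the exact gradient is $-2\cos w$ and gradient descent drives the weight towards $\pi/2$, where the output state equals the label.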

4 Experiments

A fundamental model of QNNs has been defined in the previous section. In this section, we show how it can be converted to a specific circuit model. The problem we tackle is classification of the MNIST handwritten digit database.

4.1 Variational Quantum Circuits on MNIST Dataset

The MNIST dataset contains 60,000 training samples and 10,000 testing samples. Each of the grayscale images contains a single handwritten digit that the model must correctly identify. In order to encode the pixel values of a $28 \times 28$ image in a quantum state, the model requires 10 qubits. So a high-performance quantum circuit simulation library called QuEST has been used in this experiment. The only preprocessing performed on the images was to zero-pad them to $32 \times 32$ to fill the quantum space. To model the quantum circuit, we arbitrarily chose a structure and fine-tuned it based on performance. This is similar to deciding the number of layers and nodes for classical ANNs; the structure was therefore defined in a completely ad-hoc fashion. The circuit consists of the image encoder followed by the modular structure shown in Figure 3. On a quantum computer, the output probabilities could be estimated by running the circuit many times and forming a probability histogram. In our simulations, we measured these probabilities directly. We used the mean squared error between the normalized probability and a one-hot representation of the label to calculate the loss, as shown in Section 3.3.1. The weights were updated using mini-batch gradient descent, with a batch size of 10 images.
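The preprocessing described above can be sketched in a few lines. The example pads a 28 by 28 image to 32 by 32, flattens it into 1024 normalized amplitudes (enough to fill 10 qubits), builds a one-hot label, and splits samples into mini-batches of 10. The placement of the padding (right and bottom), the synthetic checkerboard image, and all function names are assumptions; the original text does not specify these details.

```python
import math

def zero_pad(image, size=32):
    """Pad a square image (list of rows) with zeros on the right and bottom."""
    padded = [list(row) + [0.0] * (size - len(row)) for row in image]
    padded += [[0.0] * size for _ in range(size - len(image))]
    return padded

def to_amplitudes(image):
    """Flatten an image and normalize it to unit Euclidean norm."""
    flat = [pixel for row in image for pixel in row]
    norm = math.sqrt(sum(p * p for p in flat))
    return [p / norm for p in flat]

def one_hot(label, classes=10):
    return [1.0 if k == label else 0.0 for k in range(classes)]

def batches(items, size=10):
    """Split a list into consecutive mini-batches."""
    return [items[i:i + size] for i in range(0, len(items), size)]

# Synthetic stand-in for a 28x28 MNIST image (not real data).
image = [[float((r + c) % 2) for c in range(28)] for r in range(28)]
amplitudes = to_amplitudes(zero_pad(image))
n_qubits = len(amplitudes).bit_length() - 1   # log2 of a power of two
```

Since $32 \times 32 = 1024 = 2^{10}$, the padded image exactly fills the amplitude vector of a 10-qubit register.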

Figure 3: Schematic of the Variational Quantum Circuit used in this case. Every layer is defined as a series of rotation gates in the x and y directions. Every qubit is linked to one other qubit using CNOT gates.

4.2 Results and Discussion

Figure 4: Visualizing the training process in relation to the number of quantum layers. We see increased accuracy with the number of layers. The gate complexity for each layer is $O(n)$, with $n$ being the number of quantum nodes in each layer.

We ran a number of trials, varying the number of quantum layers. The proposed method achieves a recognition accuracy of 64%, the highest among current quantum neural network models. In this experiment, the learning rate decays by a factor of 0.99 every epoch. We see an increase in performance with an increase in the number of layers. The experimental results are laid out in Table 2. In Figure 4, the optimization process can be visualized as the evolution of network accuracy with every epoch. However, this was run in a simulation rather than on a real quantum system, since there are very few accessible quantum computers with 10 usable qubits. This is an important point because simulating Hilbert spaces of exponentially increasing dimension on classical computers requires enormous amounts of resources. On an actual physical quantum computer, however, most of the matrix transformations are carried out natively by the hardware and require no classical computation, potentially making QNNs far superior to classical ANNs in terms of performance.
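The cost of classical simulation can be made concrete by counting the memory needed to store a full state vector. The sketch below assumes one double-precision complex number (16 bytes) per amplitude, which is a common choice but an assumption here; the function name is hypothetical and simulator overheads are ignored.

```python
def statevector_bytes(n_qubits, bytes_per_amplitude=16):
    """Memory needed to store 2**n complex amplitudes."""
    return (2 ** n_qubits) * bytes_per_amplitude

# Memory grows by a factor of 1024 for every 10 additional qubits.
sizes = {n: statevector_bytes(n) for n in (10, 20, 30, 40)}
```

Under this assumption, the 10-qubit register used in these experiments needs only 16 KiB, while 30 qubits already need 16 GiB.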

| Number of Layers | Train Accuracy (%) | Test Accuracy (%) | Convergence (Epochs) |
|---|---|---|---|
| 4 | 38.8 | 37.3 | 92 |
| 6 | 47.0 | 50.1 | 103 |
| 10 | 56.7 | 57.2 | 59 |
| 20 | 64.08 | 64.74 | 10 |

Table 2: Experimental Results.

5 Conclusion and Future Work

In this paper, we have presented a novel fundamental algorithm for defining and training Neural Networks in Quantum Information based on time evolution and the Hamiltonian. A new deep-learning-based model has been introduced at a more fundamental level, and hence it can be inherited by any variant of quantum computing model. In addition, we have proposed a new quantum backpropagation algorithm to train the new QNN model and validated this algorithm on the MNIST dataset in a quantum computer simulation. Future work will focus on addressing the information loss of quantum states due to measurement. This problem can be addressed by maximally entangling the quantum state. Another investigation will run and test the QNN model on near-term quantum processors.