1. Introduction
In the past few years, we have witnessed many breakthroughs in both the machine learning and quantum computing research fields. In machine learning, automated machine learning (AutoML) (Zoph and Le, 2016; Zoph et al., 2018) significantly reduces the cost of designing neural networks, helping to democratize AI. In quantum computing, the scale of actual quantum computers has been evolving rapidly (e.g., IBM (IBM, 2020) recently announced a quantum computer with 1,121 quantum bits (qubits) to debut in 2023). Both fields, however, have hit bottlenecks when putting theoretical knowledge into practice. With large-size inputs, machine learning models (i.e., neural networks) significantly exceed the resources provided by classical computing platforms (e.g., GPUs and FPGAs); on the other hand, the development of quantum applications lags far behind the development of quantum hardware, that is, there is a lack of killer applications that take full advantage of the high parallelism provided by a quantum computer. As a result, it is natural to see the emergence of a new research field, quantum machine learning.
As with applying machine learning to classical hardware accelerators, when machine learning meets quantum computers, there will be plenty of opportunities along with challenges. The development of machine learning on classical hardware accelerators experienced two phases: (1) the design of neural-network-tailored hardware (Zhang et al., 2015; Jiang et al., 2018; Zhang et al., 2018b; Jiang et al., 2019a; Li et al., 2016, 2017), and (2) the co-design of neural networks and hardware accelerators (Jiang et al., 2019c; Jiang et al., 2019b; Yang et al., 2020a; Bian et al., 2020; Jiang et al., 2020a; Ding et al., 2020; Wu et al., 2019; Cai et al., 2018; Tan et al., 2019; Hao et al., 2019b, a; Zeng et al., 2020; Wu et al., 2020). To best exploit the power of the quantum computer, it is essential to co-design the neural network and the quantum circuit; however, since quantum circuits are built from different basic logic gates than classical circuits, it is still unclear how to design a quantum accelerator for neural networks.
In this work, we aim to fill this missing link by providing an open-source design framework. In general, the full acceleration system is divided into three parts: data pre-processing and data post-processing on a classical computer, and the neural network accelerator on the quantum circuit. The quantum circuit further includes quantum state preparation and quantum-computing-based neural computation. In the remainder of this paper, we introduce all of the above components in detail and demonstrate an implementation using IBM Qiskit for the quantum circuit design and PyTorch for the machine learning model.
The remainder of the paper is organized as follows. Section 2 presents an overview of the full system. Section 3 presents the case study on the MNIST dataset. Insights are discussed in Section 4. Finally, concluding remarks are given in Section 5.
2. Overview
Figure 1 demonstrates three types of neural network design: (1) the classical hardware accelerator; (2) the pure quantum-computing-based accelerator; (3) the hybrid quantum-classical accelerator. All of these accelerators follow the same flow: the data is first pre-processed, then the neural computation is accelerated, and finally the output data goes through post-processing to obtain the final results.
2.1. Classical acceleration
After the success of deep neural networks (e.g., AlexNet (Krizhevsky et al., 2017) and VGGNet (Simonyan and Zisserman, 2014)) in achieving high accuracy, designing hardware accelerators became a hot topic for speeding up the execution of deep neural networks. On the application-specific integrated circuit (ASIC), the works (Du et al., 2015; Zhang et al., 2018a; Zhang and Garg, 2018; Zhang et al., 2019b; Chen et al., 2016) studied how to design neural network accelerators using different dataflows, including weight stationary, output stationary, etc. By selecting the dataflow for a dedicated neural computation, these designs maximize data reuse to reduce data movement and accelerate processing, which led to the co-design of neural networks and ASICs (Yang et al., 2020b).
On the FPGA, the work (Zhang et al., 2015) first proposed a tiling-based design to accelerate neural computation, and the works (Jiang et al., 2018; Zhang et al., 2018b; Jiang et al., 2019a; Li et al., 2015) gave different designs and extended the implementation to multiple FPGAs. Driven by AutoML, the work (Jiang et al., 2019c) proposed the first co-design framework to bring the FPGA implementation into the search loop, so that both software accuracy and hardware efficiency can be maximized. The co-design philosophy has also been applied in other designs (Zhang et al., 2019a; Jiang et al., 2020d; Hao et al., 2019b, a), and in this direction there are many research works that further integrate model compression (Lu et al., 2019; Jiang et al., 2020c) and accelerate the search process (Li et al., 2020; Zhang et al., 2020).
2.2. Pure quantum computing
Most recently, emerging works use quantum circuits to accelerate neural computation. Typical works include (Francesco et al., 2019; Tacchino et al., 2020; Jiang et al., 2020b), among which (Jiang et al., 2020b) first demonstrates the potential quantum advantage that can be achieved by using a co-design philosophy. These works encode data to either qubits (Francesco et al., 2019) or qubit states (Jiang et al., 2020b) and use superconducting-based quantum computers to run neural networks. These methods have the following limitation: due to the short decoherence times of superconducting-based quantum computers, conditional logic is not supported in the computing process. This makes it hard to implement a function that is not differentiable at all points, like the commonly used Rectified Linear Unit (ReLU) in machine learning models. However, the pure quantum approach also has advantages: the design can be directly evaluated on an actual quantum computer, and there is no communication across the quantum-classical interface during the computation.
As shown in Figure 1(b), the quantum circuit design includes two components: one for quantum state preparation and one for neural computation. After the neural computation component, the qubits are measured to extract the output data, which is further sent to the data post-processing unit to obtain the final results.
2.3. Hybrid quantumclassical computing
To overcome the disadvantages of pure quantum computing and make full use of classical computing, hybrid quantum-classical computing for machine learning tasks has been proposed (Broughton et al., 2020). It establishes a computing paradigm where different neurons can be implemented on either quantum or classical computers, as demonstrated in Figure 1(c). This brings flexibility in implementing functions (e.g., ReLU). However, at the same time, it leads to massive data transfer between quantum and classical computers.

2.4. Our Focus in The Case Study
This work focuses on providing a full workflow, starting from data pre-processing, going through quantum computing acceleration, and ending with data post-processing. We use the MNIST dataset as an example to carry out a case study.
The computing architecture and the neural operation both affect the design. In this work, for the computing architecture, we focus on the pure quantum computing design, since it can easily be extended to the hybrid quantum-classical design by connecting the inputs and outputs of the quantum accelerator to a traditional classical accelerator; for the neural network, we focus on the multi-layer perceptron, which provides the basic operation behind a large number of neural computations, such as convolution.
3. Case Study on MNIST Dataset
In this section, we demonstrate the detailed implementation of the four components of the pure quantum-computing-based neural accelerator shown in Figure 1(b): data pre-processing, quantum state preparation, neural computation, and data post-processing.
3.1. Data Pre-Processing
The first step of the whole procedure is to prepare the classical data to be encoded into quantum states. Kindly note that, in order to utilize qubits to represent data, there are constraints on the numbers; more specifically, if a vector of data can be arranged as the first column of a unitary matrix, then, starting from the zero state of the qubits, we can obtain the corresponding quantum state by applying that unitary matrix to the zero state. Listing 1 demonstrates the data conversion from classical data to quantum data. We utilize the transforms in torchvision to complete the data conversion. More specifically, we create the ToQuantumData class in Line 5. It receives a tensor (the original data) as input (Line 6). We apply the Singular Value Decomposition (svd) provided by np.linalg to obtain the unitary matrix output_matrix (Line 14), and then we extract the first vector from output_matrix as the output_data (Line 16); here, output_matrix represents the unitary matrix and output_data represents the quantum data. After we build the ToQuantumData class, we integrate it into one “transform” variable, which can further include data pre-processing functions such as image resize (Line 20) and data normalization (Line 21). In creating the data loader, we can apply the “transform” to the dataset (e.g., we can obtain the training data by using “train_data=datasets.MNIST(root=datapath, train=True, download=True, transform=transform)”).

3.2. Quantum State Preparation
Theoretically, with the unitary matrix obtained above, we can directly operate the corresponding oracle on the quantum circuit to change the qubits from the zero state to the desired state. This process is widely known as quantum-state preparation. The efficiency of quantum-state preparation can significantly affect the complexity of the whole circuit, and it is therefore quite important to improve the efficiency of this process. In general, there are two typical ways to perform quantum-state preparation: (1) the quantum random access memory (qRAM) (Lvovsky et al., 2009) based approach (Allcock et al., 2020; Kerenidis and Prakash, 2016) and (2) the computing-based approach (Sanders et al., 2019; Grover, 2000; Bausch, 2020). Let us first look at the qRAM-based approach, where the data vector is stored in a binary-tree-based structure in qRAM, which can be queried in quantum superposition and can generate the states efficiently. IBM Qiskit provides an initialization function to perform quantum-state preparation, which is based on the method in (Shende et al., 2006).
In Listing 2, we give the code to initialize the quantum states, using the unitary matrix that is converted from the original data in Listing 1 (see Line 18). In this code snippet, we first create a 4-qubit QuantumRegister “inp” (Line 6) and the quantum circuit (Line 7). Then, we convert the input data to data_matrix, which is then employed to initialize the circuit using the UnitaryGate function from qiskit.extensions. Finally, from Line 10 to Line 14, we output the states of all qubits to verify the correctness.
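To make the conversion concrete, the following NumPy sketch (our own code, not the paper's exact Listings 1 and 2; the variable names are ours) shows the SVD trick that embeds a normalized input as the first column of a unitary matrix, so that applying the unitary to the zero state reproduces the data as amplitudes:

```python
import numpy as np

def to_quantum_matrix(data):
    """Embed a length-2^n vector as the first column of a unitary matrix.

    The seed matrix has the normalized data as its first column; the SVD
    factors give a unitary completion W = U @ Vh whose first column equals
    the data, so that W |0...0> encodes the input as amplitudes.
    """
    vec = np.asarray(data, dtype=float).flatten()
    vec = vec / np.linalg.norm(vec)        # amplitudes must be normalized
    n = vec.size
    seed = np.zeros((n, n))
    seed[:, 0] = vec                        # desired first column
    u, _, vh = np.linalg.svd(seed)
    return u @ vh                           # unitary completion of the column

image = np.arange(1.0, 17.0).reshape(4, 4)  # toy 4x4 "image" -> 4 qubits
unitary = to_quantum_matrix(image)
zero_state = np.zeros(16)
zero_state[0] = 1.0
state = unitary @ zero_state                # |psi> = U |0000>
```

In Qiskit, the resulting matrix could then be attached to a circuit (e.g., via a unitary gate or the initialization function), which is the role Listing 2 plays.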
3.3. Neural Computation
Now, we have encoded the image data (16 inputs) onto 4 qubits. The next step is to perform the neural computation, that is, the weighted sum followed by a quadratic function, using the given binary weights. Neural computation is the key component of a quantum machine learning implementation. To clearly introduce this component, we first consider the computation of one neuron in the hidden layer, which can be further divided into two stages: (1) multiplying inputs and weights, and (2) applying the quadratic function to the weighted sum. Then, we present the computation of the output layer to obtain the final results.
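As a classical reference for the two stages above (our own sketch, not code from the paper), the target computation is a ±1-weighted sum of the normalized inputs, scaled by the 1/sqrt(2^n) factor that the later Hadamard step introduces, followed by a quadratic activation:

```python
import numpy as np

def neuron_reference(x, w):
    """Classical reference for the quantum neuron: x holds normalized input
    amplitudes, w holds +1/-1 weights; the output is the probability that
    the circuit's output qubit measures |1>."""
    weighted_sum = np.dot(w, x) / np.sqrt(len(x))  # 1/sqrt(2^n) from H gates
    return weighted_sum ** 2                        # quadratic "activation"

x = np.ones(16) / 4.0          # uniform, normalized input on 4 qubits
w = np.ones(16)
w[0] = -1                      # one negative weight
prob = neuron_reference(x, w)
```

Because the output is a squared amplitude, it is always a valid probability in [0, 1], which is why the quadratic function plays the role of the non-linearity here.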
Computation of one neuron in the hidden layer
Stage 1: multiplying inputs and weights. Since the weights are given, they are predetermined, and we can use quantum gates to apply the weights to the inputs. The quantum gates applied here are the X gate and the controlled-Z gate with 3 control qubits (CCCZ). The function of such a CCCZ gate is to flip the sign of the state whose qubits are all ones, and the function of the X gate is to swap the amplitude of one state to another state.
For example, suppose the weight for state |0011⟩ is −1. We apply it to the input in three steps. First, we swap the amplitude of state |0011⟩ to state |1111⟩ using two X gates on the first two qubits. Then, in the second step, we apply the CCCZ gate to flip the sign of the state |1111⟩. Finally, in the third step, we swap the amplitude of state |1111⟩ back to state |0011⟩ using two X gates on the first two qubits. Therefore, we can traverse all weights and apply the above three steps to flip the signs of the corresponding states. Kindly note that since the non-linear function is a quadratic function, if the number of −1 weights is larger than the number of +1 weights, we can flip the signs of all weights to minimize the number of gates to be put in the circuit.
Listing 3 demonstrates the procedure of multiplying inputs and weights. In the listing, the function cccz utilizes the basic quantum logic gates to realize the controlled-Z gate with 3 control qubits. The involved basic gates include the Toffoli gate (i.e., CCX) and the controlled-Z gate (i.e., CZ). Since such a function needs auxiliary (a.k.a. ancilla) qubits, we include 2 additional qubits in the quantum circuit, as shown in Lines 19–20.
The function neg_weights_gate flips the sign of a given state, applying the 3-step process. Lines 11–13 complete the first step, swapping the amplitude of the given state to the all-ones state. Then, the cccz gate is applied to complete the second step. Finally, from Line 15 to Line 17, the amplitude is swapped back to the given state.
With the above two functions, we traverse the weights to assign a sign to each state in Lines 21–27. Kindly note that, after this operation, the state vector has changed from the initial prepared state to one whose amplitudes carry the weights.
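The sign-flip trick of Stage 1 can be checked with a small matrix-level simulation (our own NumPy sketch, not the paper's Qiskit Listing 3; by convention here, qubit 0 is the most significant bit of the state index):

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])

def kron_all(ops):
    out = ops[0]
    for op in ops[1:]:
        out = np.kron(out, op)
    return out

# CCCZ on 4 qubits: flips the sign of |1111> and leaves every other state alone
CCCZ = np.eye(16)
CCCZ[15, 15] = -1.0

def neg_weight_gate(target):
    """Flip the sign of basis state |target>: map it to |1111> with X gates,
    apply CCCZ, then map it back (the 3-step sequence of Stage 1)."""
    bits = [(target >> (3 - q)) & 1 for q in range(4)]
    flips = kron_all([I2 if b else X for b in bits])  # X on every 0-valued qubit
    return flips @ CCCZ @ flips

gate = neg_weight_gate(0b0011)        # weight -1 assigned to state |0011>
state = np.ones(16) / 4.0             # uniform input state
signed = gate @ state                 # only the amplitude of |0011> is negated
```

Applying the same gate twice restores the original state, since the three-step sequence is its own inverse.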
Stage 2: applying a quadratic function to the weighted sum. This stage also follows 3 steps. In the first step, we apply the Hadamard (H) gate on all qubits to accumulate all states onto the zero state. Then, the second step swaps the amplitudes of the zero state |0000⟩ and the one-state |1111⟩. Finally, the last step applies an N-control-X gate to extract the amplitude to one output qubit, for which the probability of measuring |1⟩ is equal to the square of the weighted sum. In the first step, the H gates can accumulate the amplitudes of the states because the first row of the matrix H⊗H⊗H⊗H is (1/4, 1/4, ..., 1/4), and applying the H gates performs the multiplication between this matrix and the state vector. As a result, the amplitude of the zero state becomes the weighted sum with a coefficient of 1/4.
Listing 4 demonstrates the implementation of the quadratic function on the weighted sum in Qiskit. In the listing, the function ccccx is based on the basic Toffoli gate (i.e., CCX) to implement a 4-control-X gate, which swaps the amplitude between the zero state |0000⟩ and the one-state |1111⟩. In Line 14, an additional output qubit, added to the quantum circuit in Lines 10–11, holds the result of the neural computation.
For a neural network with multiple neurons in the hidden layer, there is one set of weights per neuron. We can apply the above neural computation to each set of weights to obtain one output qubit per neuron.
Computation of one neuron in the output layer
With these output qubits, we have two choices: (1) return to the classical computer, encode the outputs of these qubits again, and repeat the hidden-layer computations to obtain the final results; or (2) continue to use these qubits to directly compute the outputs, but the fundamental computation then needs to be changed to multiplication between random variables, because the data associated with a qubit represents the probability of that qubit being in the |1⟩ state. In the following, we demonstrate the implementation of the second choice (for fundamental details, please refer to (Jiang et al., 2020b; Tacchino et al., 2020)). In this example, we follow a network structure with 2 neurons in the hidden layer. In addition, we consider only one parameter for the normalization function, using one additional qubit for each output neuron. Let the outputs of the 2 neurons in the hidden layer and the weights of the first output neuron be given, and let norm_flag_1 and norm_para_1 be the normalization-related parameters for that output neuron. Then, we have the implementation in Listing 5.
The listing follows the same 2-stage pattern as the computation in the hidden layer. If we modify all sub-indices from 1 to 2, we obtain the quantum circuit for the second output neuron.
3.4. Data PostProcessing
After all outputs are computed and stored in the out_q_1 and out_q_2 qubits, we can then measure the output qubits, run a simulation or execute on the IBM Q processors, and finally obtain the classification as follows.
Listing 6 demonstrates the above three tasks. The fire_ibmq function can execute the constructed circuit either in simulation or on a given IBM Q processor backend. The parameter “shots” defines the number of executions. Finally, the counts for each state are returned. In this implementation, the probability of each qubit (instead of each state) gives the probability of choosing the corresponding class. Therefore, we create the “analyze” function to get the probability for each qubit. Finally, we obtain the classification result by extracting the index of the maximum probability in the “class_prob” set.
Kindly note that Listing 6 can also be applied to hybrid quantum-classical computing.
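A minimal sketch of the post-processing (our own code, modeled on, but not copied from, Listing 6): it assumes Qiskit-style count dictionaries keyed by bitstrings with qubit 0 as the last character, and turns them into per-qubit probabilities and a predicted class:

```python
def analyze(counts):
    """Turn state-level counts (e.g. {'10': 600, '01': 300}) into the
    probability of measuring |1> on each qubit."""
    shots = sum(counts.values())
    num_qubits = len(next(iter(counts)))
    probs = [0.0] * num_qubits
    for bitstring, count in counts.items():
        # Qiskit bitstrings list qubit 0 last, so reverse before indexing
        for qubit, bit in enumerate(reversed(bitstring)):
            if bit == '1':
                probs[qubit] += count / shots
    return probs

counts = {'10': 600, '01': 300, '11': 100}   # hypothetical measurement counts
class_prob = analyze(counts)
prediction = class_prob.index(max(class_prob))
```

Here the hypothetical counts give qubit 1 the higher |1⟩ probability, so class 1 would be predicted; with real hardware results, the same argmax step selects the class.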
4. Insights
From this study of implementing neural networks on quantum circuits, we draw several insights in terms of achieving quantum advantages, listed as follows.

Data encoding: this case study encodes 2^n data onto the states of n qubits, which provides the opportunity to achieve quantum advantage when conducting inference for each input. An alternative way is to encode one datum per qubit; however, considering that each datum needs to be operated on in the neural computation, such an encoding approach can hardly achieve quantum advantage.

Quantum-state preparation: by encoding 2^n data onto n qubits, we can achieve quantum advantage only if the quantum-state preparation can be conducted with sufficiently low complexity.

Quantum-computing-based neural computation: neural computation can also become the performance bottleneck. Using the design in Listing 3 to flip one sign at a time requires, in the worst case, one 3-step gate sequence per negative weight, i.e., a number of gates that grows with the number of states. To overcome this, (Jiang et al., 2020b) proposed a co-design approach to reduce the number of gates.
5. Conclusion
This work demonstrates a framework for implementing neural networks on quantum circuits. It is composed of three main components: data pre-processing, neural computation acceleration, and data post-processing. Based on this workflow, the data is first encoded into quantum states and then operated on to complete the operations of a neural network. The source code can be found at https://github.com/weiwenjiang/QML_tutorial.
Acknowledgements
This work is partially supported by the IBM and University of Notre Dame (IBM-ND) Quantum program, and in part by the IBM-ILLINOIS Center for Cognitive Computing Systems Research.
References
 Allcock et al. (2020) Jonathan Allcock, ChangYu Hsieh, Iordanis Kerenidis, and Shengyu Zhang. 2020. Quantum algorithms for feedforward neural networks. ACM Transactions on Quantum Computing 1, 1 (2020), 1–24.
 Bausch (2020) Johannes Bausch. 2020. Fast BlackBox Quantum State Preparation. arXiv preprint arXiv:2009.10709 (2020).
 Bian et al. (2020) Song Bian, Weiwen Jiang, Qing Lu, Yiyu Shi, and Takashi Sato. 2020. Nass: Optimizing secure inference via neural architecture search. arXiv preprint arXiv:2001.11854 (2020).
 Broughton et al. (2020) Michael Broughton, Guillaume Verdon, Trevor McCourt, Antonio J Martinez, Jae Hyeon Yoo, Sergei V Isakov, Philip Massey, Murphy Yuezhen Niu, Ramin Halavati, Evan Peters, et al. 2020. Tensorflow quantum: A software framework for quantum machine learning. arXiv preprint arXiv:2003.02989 (2020).
 Cai et al. (2018) Han Cai, Ligeng Zhu, and Song Han. 2018. Proxylessnas: Direct neural architecture search on target task and hardware. arXiv preprint arXiv:1812.00332 (2018).

 Chen et al. (2016) Yu-Hsin Chen, Tushar Krishna, Joel S Emer, and Vivienne Sze. 2016. Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks. IEEE Journal of Solid-State Circuits 52, 1 (2016), 127–138.
 Ding et al. (2020) Yukun Ding, Weiwen Jiang, Qiuwen Lou, Jinglan Liu, Jinjun Xiong, Xiaobo Sharon Hu, Xiaowei Xu, and Yiyu Shi. 2020. Hardware design and the competency awareness of a neural network. Nature Electronics 3, 9 (2020), 514–523.
 Du et al. (2015) Zidong Du, Robert Fasthuber, Tianshi Chen, Paolo Ienne, Ling Li, Tao Luo, Xiaobing Feng, Yunji Chen, and Olivier Temam. 2015. ShiDianNao: Shifting vision processing closer to the sensor. In Proceedings of the 42nd Annual International Symposium on Computer Architecture. 92–104.
 Francesco et al. (2019) Tacchino Francesco, Macchiavello Chiara, Gerace Dario, and Bajoni Daniele. 2019. An artificial neuron implemented on an actual quantum processor. NPJ Quantum Information 5, 1 (2019).
 Grover (2000) Lov K Grover. 2000. Synthesis of quantum superpositions by quantum computation. Physical review letters 85, 6 (2000), 1334.
 Hao et al. (2019a) Cong Hao, Yao Chen, Xinheng Liu, Atif Sarwari, Daryl Sew, Ashutosh Dhar, Bryan Wu, Dongdong Fu, Jinjun Xiong, Wenmei Hwu, et al. 2019a. NAIS: Neural architecture and implementation search and its applications in autonomous driving. arXiv preprint arXiv:1911.07446 (2019).
 Hao et al. (2019b) Cong Hao, Xiaofan Zhang, Yuhong Li, Sitao Huang, Jinjun Xiong, Kyle Rupnow, Wenmei Hwu, and Deming Chen. 2019b. FPGA/DNN CoDesign: An Efficient Design Methodology for 1oT Intelligence on the Edge. In 2019 56th ACM/IEEE Design Automation Conference (DAC). IEEE, 1–6.
 IBM (2020) IBM. 2020. IBM’s Roadmap For Scaling Quantum Technology. https://www.ibm.com/blogs/research/2020/09/ibmquantumroadmap/ (2020). Accessed: 20200930.
 Jiang et al. (2020a) Weiwen Jiang, Qiuwen Lou, Zheyu Yan, Lei Yang, Jingtong Hu, X Sharon Hu, and Yiyu Shi. 2020a. Devicecircuitarchitecture coexploration for computinginmemory neural accelerators. IEEE Trans. Comput. (2020).
 Jiang et al. (2019a) Weiwen Jiang, Edwin HM Sha, Xinyi Zhang, Lei Yang, Qingfeng Zhuge, Yiyu Shi, and Jingtong Hu. 2019a. Achieving superlinear speedup across multifpga for realtime dnn inference. ACM Transactions on Embedded Computing Systems (TECS) 18, 5s (2019), 1–23.
 Jiang et al. (2018) Weiwen Jiang, Edwin HsingMean Sha, Qingfeng Zhuge, Lei Yang, Xianzhang Chen, and Jingtong Hu. 2018. Heterogeneous fpgabased costoptimal design for timingconstrained cnns. IEEE Transactions on ComputerAided Design of Integrated Circuits and Systems 37, 11 (2018), 2542–2554.
 Jiang et al. (2019b) Weiwen Jiang, Bike Xie, ChunChen Liu, and Yiyu Shi. 2019b. Integrating memristors and CMOS for better AI. Nature Electronics 2, 9 (2019), 376–377.
 Jiang et al. (2020b) Weiwen Jiang, Jinjun Xiong, and Yiyu Shi. 2020b. A CoDesign Framework of Neural Networks and Quantum Circuits Towards Quantum Advantage. arXiv preprint arXiv:2006.14815 (2020).
 Jiang et al. (2020c) Weiwen Jiang, Lei Yang, Sakyasingha Dasgupta, Jingtong Hu, and Yiyu Shi. 2020c. Standing on the shoulders of giants: Hardware and neural architecture cosearch with hot start. IEEE Transactions on ComputerAided Design of Integrated Circuits and Systems 39, 11 (2020), 4154–4165.
 Jiang et al. (2020d) Weiwen Jiang, Lei Yang, Edwin HM Sha, Qingfeng Zhuge, Shouzhen Gu, Sakyasingha Dasgupta, Yiyu Shi, and Jingtong Hu. 2020d. Hardware/Software coexploration of neural architectures. IEEE Transactions on ComputerAided Design of Integrated Circuits and Systems (2020).
 Jiang et al. (2019c) Weiwen Jiang, Xinyi Zhang, Edwin HM Sha, Lei Yang, Qingfeng Zhuge, Yiyu Shi, and Jingtong Hu. 2019c. Accuracy vs. efficiency: Achieving both through fpgaimplementation aware neural architecture search. In Proceedings of the 56th Annual Design Automation Conference 2019. 1–6.
 Kerenidis and Prakash (2016) Iordanis Kerenidis and Anupam Prakash. 2016. Quantum recommendation systems. arXiv preprint arXiv:1603.08675 (2016).
 Krizhevsky et al. (2017) Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2017. Imagenet classification with deep convolutional neural networks. Commun. ACM 60, 6 (2017), 84–90.

 Li et al. (2015) Bingzhe Li, M Hassan Najafi, and David J Lilja. 2015. An FPGA implementation of a restricted Boltzmann machine classifier using stochastic bit streams. In 2015 IEEE 26th International Conference on Application-specific Systems, Architectures and Processors (ASAP). IEEE, 68–69.
 Li et al. (2016) Bingzhe Li, M Hassan Najafi, and David J Lilja. 2016. Using stochastic computing to reduce the hardware requirements for a restricted Boltzmann machine classifier. In Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. 36–41.
 Li et al. (2017) Bingzhe Li, Yaobin Qin, Bo Yuan, and David J Lilja. 2017. Neural network classifiers using stochastic computing with a hardware-oriented approximate activation function. In 2017 IEEE International Conference on Computer Design (ICCD). IEEE, 97–104.
 Li et al. (2020) Yuhong Li, Cong Hao, Xiaofan Zhang, Xinheng Liu, Yao Chen, Jinjun Xiong, Wenmei Hwu, and Deming Chen. 2020. EDD: Efficient Differentiable DNN Architecture and Implementation Co-search for Embedded AI Solutions. arXiv preprint arXiv:2005.02563 (2020).
 Lu et al. (2019) Qing Lu, Weiwen Jiang, Xiaowei Xu, Yiyu Shi, and Jingtong Hu. 2019. On neural architecture search for resourceconstrained hardware platforms. arXiv preprint arXiv:1911.00105 (2019).
 Lvovsky et al. (2009) Alexander I Lvovsky, Barry C Sanders, and Wolfgang Tittel. 2009. Optical quantum memory. Nature photonics 3, 12 (2009), 706–714.
 Sanders et al. (2019) Yuval R Sanders, Guang Hao Low, Artur Scherer, and Dominic W Berry. 2019. Blackbox quantum state preparation without arithmetic. Physical review letters 122, 2 (2019), 020502.
 Shende et al. (2006) Vivek V Shende, Stephen S Bullock, and Igor L Markov. 2006. Synthesis of quantumlogic circuits. IEEE Transactions on ComputerAided Design of Integrated Circuits and Systems 25, 6 (2006), 1000–1010.
 Simonyan and Zisserman (2014) Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for largescale image recognition. arXiv preprint arXiv:1409.1556 (2014).

 Tacchino et al. (2020) Francesco Tacchino, Panagiotis Barkoutsos, Chiara Macchiavello, Ivano Tavernelli, Dario Gerace, and Daniele Bajoni. 2020. Quantum implementation of an artificial feed-forward neural network. Quantum Science and Technology (2020).
 Tan et al. (2019) Mingxing Tan, Bo Chen, Ruoming Pang, Vijay Vasudevan, Mark Sandler, Andrew Howard, and Quoc V Le. 2019. Mnasnet: Platform-aware neural architecture search for mobile. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2820–2828.
 Wu et al. (2019) Bichen Wu, Xiaoliang Dai, Peizhao Zhang, Yanghan Wang, Fei Sun, Yiming Wu, Yuandong Tian, Peter Vajda, Yangqing Jia, and Kurt Keutzer. 2019. Fbnet: Hardware-aware efficient convnet design via differentiable neural architecture search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 10734–10742.
 Wu et al. (2020) Yawen Wu, Zhepeng Wang, Yiyu Shi, and Jingtong Hu. 2020. Enabling OnDevice CNN Training by SelfSupervised Instance Filtering and Error Map Pruning. IEEE Transactions on ComputerAided Design of Integrated Circuits and Systems 39, 11 (2020), 3445–3457.

 Yang et al. (2020a) Lei Yang, Weiwen Jiang, Weichen Liu, HM Edwin, Yiyu Shi, and Jingtong Hu. 2020a. Co-exploring neural architecture and network-on-chip design for real-time artificial intelligence. In 2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC). IEEE, 85–90.
 Yang et al. (2020b) Lei Yang, Zheyu Yan, Meng Li, Hyoukjun Kwon, Liangzhen Lai, Tushar Krishna, Vikas Chandra, Weiwen Jiang, and Yiyu Shi. 2020b. Co-Exploration of Neural Architectures and Heterogeneous ASIC Accelerator Designs Targeting Multiple Tasks. arXiv preprint arXiv:2002.04116 (2020).
 Zeng et al. (2020) Dewen Zeng, Weiwen Jiang, Tianchen Wang, Xiaowei Xu, Haiyun Yuan, Meiping Huang, Jian Zhuang, Jingtong Hu, and Yiyu Shi. 2020. Towards Cardiac Intervention Assistance: Hardwareaware Neural Architecture Exploration for RealTime 3D Cardiac Cine MRI Segmentation. In 2020 IEEE/ACM International Conference On Computer Aided Design (ICCAD). IEEE, 1–8.
 Zhang et al. (2015) Chen Zhang, Peng Li, Guangyu Sun, Yijin Guan, Bingjun Xiao, and Jason Cong. 2015. Optimizing fpgabased accelerator design for deep convolutional neural networks. In Proceedings of the 2015 ACM/SIGDA international symposium on fieldprogrammable gate arrays. 161–170.
 Zhang et al. (2019b) Jeff Zhang, Parul Raj, Shuayb Zarar, Amol Ambardekar, and Siddharth Garg. 2019b. CompAct: Onchip Compression of Activations for Low Power Systolic Array Based CNN Acceleration. ACM Transactions on Embedded Computing Systems (TECS) 18, 5s (2019), 1–24.

 Zhang et al. (2018a) Jeff Zhang, Kartheek Rangineni, Zahra Ghodsi, and Siddharth Garg. 2018a. Thundervolt: enabling aggressive voltage underscaling and timing error resilience for energy efficient deep learning accelerators. In Proceedings of the 55th Annual Design Automation Conference. 1–6.
 Zhang and Garg (2018) Jeff Jun Zhang and Siddharth Garg. 2018. FATE: fast and accurate timing error prediction framework for low power DNN accelerator design. In 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD). IEEE, 1–8.
 Zhang et al. (2019a) Xinyi Zhang, Weiwen Jiang, Yiyu Shi, and Jingtong Hu. 2019a. When neural architecture search meets hardware implementation: from hardware awareness to codesign. In 2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI). IEEE, 25–30.
 Zhang et al. (2018b) Xiaofan Zhang, Junsong Wang, Chao Zhu, Yonghua Lin, Jinjun Xiong, Wenmei Hwu, and Deming Chen. 2018b. DNNBuilder: an automated tool for building highperformance DNN hardware accelerators for FPGAs. In 2018 IEEE/ACM International Conference on ComputerAided Design (ICCAD). IEEE, 1–8.
 Zhang et al. (2020) Yongan Zhang, Yonggan Fu, Weiwen Jiang, Chaojian Li, Haoran You, Meng Li, Vikas Chandra, and Yingyan Lin. 2020. DNA: Differentiable NetworkAccelerator CoSearch. arXiv preprint arXiv:2010.14778 (2020).
 Zoph and Le (2016) Barret Zoph and Quoc V Le. 2016. Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578 (2016).
 Zoph et al. (2018) Barret Zoph, Vijay Vasudevan, Jonathon Shlens, and Quoc V Le. 2018. Learning transferable architectures for scalable image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 8697–8710.