I Introduction
Quantum computing is a new paradigm of computing that uses the principles of quantum mechanical systems such as superposition and entanglement. This new paradigm has increasingly been used to develop algorithms with superior computational performance when compared to classical counterparts [13, 8]. The principle of amplitude amplification [2] has been used to develop computationally efficient algorithms for risk analysis and inference in the context of Bayesian network models [9, 16]. Bayesian networks are probabilistic graphical models that are widely used for uncertainty representation and propagation, risk analysis, and probabilistic inference with applications in several domains of science, engineering, and healthcare such as transportation, logistics, bioinformatics, civil infrastructure, manufacturing, and radiotherapy treatment [1]. A Quantum Bayesian network (QBN) is a quantum version of the classical Bayesian network. In order to use the developed quantum algorithms, the Bayesian networks must first be represented on quantum computing hardware.
In our prior work [1], we developed a generic approach to construct a quantum circuit on the IBM quantum gate architecture [4] that represents any given Bayesian network. We referred to this approach as the Compositional Quantum Bayesian Network (CQBN), as the overall circuit is obtained by composing smaller circuits relating to the marginal/conditional probabilities of various nodes in the Bayesian network. More details are available in Section III. We demonstrated the developed approach using Qiskit, a Python package from IBM that simulates quantum computing [11]. In addition to the simulator, IBM also provides free access to a number of their real quantum devices with different capacities in terms of the number of qubits [7]. There are several 5-qubit devices, a 15-qubit device and a simulator available through the IBM Q Experience, an online platform used to access the IBM quantum devices. As these devices are physical systems, they are affected by several types of noise in the qubits, in the implementation of quantum gates, and in the operating environment. Several recent studies have focused on evaluating the effect of noise, including decoherence, on the performance of these devices and on finding ways to mitigate it [3, 6, 17]. The computational performance of algorithms that run on the devices also depends on the mapping of variables to the qubits in the devices. Even on the same device, the noise in the implementation of gates differs from qubit to qubit. Since the number of gate operations applied also differs from qubit to qubit, it is desirable to find the mapping that minimizes the overall error.
To accomplish this, IBM provides a transpiler that outputs an optimal mapping to qubits on devices. Since different devices have different amounts of gate noise and different connectivity between qubits, the mapping is chosen by considering the gate noise and device connectivity. The input circuit can be developed using either QASM or Qiskit under the assumption that all the qubits are connected to each other; given the choice of device, the transpiler provides a transformed circuit that can be run on that device [14]. Another input to the transpiler is the amount of optimization that needs to be performed to derive the transformed circuit. There are four levels of optimization, and each level results in a different transformed circuit. More details are available in Section IV.
The goal of this paper is to answer the following questions:

Can the existing IBM QX hardware simulate Bayesian networks?

Is there a variation in the simulation accuracy across various IBM QX hardware?

Is there a variation in the simulation accuracy across the four levels of optimization available in the transpiler?
Paper Organization: Section II provides a brief background to Bayesian networks and various quantum gates. Section III discusses the CQBN approach for the quantum circuit representation of a QBN. Section IV discusses various IBM QX hardware, their errors and the different levels of the transpiler. Section V discusses the evaluation study of executing an illustrative 4-node Bayesian network on various hardware and at different levels of optimization, followed by concluding remarks and future work in Section VI.
II Background
II-A Bayesian networks
A Bayesian Network (BN) is a directed acyclic graphical model with nodes and edges that are used to represent various random variables and dependence among them respectively. Mathematically, a Bayesian network represents a joint probability distribution over a set of random variables as a product of marginal and conditional probability distributions.
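In a BN with $n$ nodes given by the set $V = \{V_1, V_2, \dots, V_n\}$, the joint distribution can be written as
$$P(V_1, V_2, \dots, V_n) = \prod_{i=1}^{n} P(V_i \mid \Pi_{V_i}) \tag{1}$$
where $\Pi_{V_i}$ refers to the set of parent nodes associated with $V_i$. For root nodes (nodes without parent nodes or the nodes at the top of a BN), $P(V_i \mid \Pi_{V_i})$ becomes equal to $P(V_i)$.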
Fig. 1 represents a simple BN with two discrete variables A and B, each of which can take two values, 0 and 1. Here, A is a root node and the parent node of B. For each value of the parent node (A=0,1), we have a conditional probability table of the child node (B).
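As a minimal illustration of Eq. (1) for the two-node network of Fig. 1, the following Python sketch multiplies a marginal table for A with a conditional table for B. The probability values are hypothetical placeholders chosen for illustration; they are not taken from Fig. 1.

```python
# Hypothetical probabilities for the two-node network A -> B (illustrative only).
p_a = {0: 0.3, 1: 0.7}                    # marginal P(A)
p_b_given_a = {0: {0: 0.9, 1: 0.1},       # P(B | A=0)
               1: {0: 0.2, 1: 0.8}}       # P(B | A=1)


def joint(a, b):
    """P(A=a, B=b) = P(A=a) * P(B=b | A=a), the factorization of Eq. (1)."""
    return p_a[a] * p_b_given_a[a][b]


def marginal_b(b):
    """P(B=b), obtained by summing the joint distribution over A."""
    return sum(joint(a, b) for a in (0, 1))


joint_table = {(a, b): joint(a, b) for a in (0, 1) for b in (0, 1)}
for (a, b), p in joint_table.items():
    print(f"P(A={a}, B={b}) = {p:.3f}")
print(f"P(B=0) = {marginal_b(0):.3f}")
```

The same pattern extends to larger networks by multiplying one factor per node, each conditioned on the values of its parents.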
II-B Quantum Gates
In this subsection, we briefly introduce the quantum gates that are later used to develop quantum circuits with the CQBN approach.
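It is a well-known theorem that the set $\{H, T, \text{CNOT}\}$ forms a universal set of gates for quantum computing [13]. Any $n$-qubit unitary operation, represented by a $2^n \times 2^n$ matrix, can be approximated up to an arbitrary precision by a sequence of gates from the set $\{H, T, \text{CNOT}\}$. The CNOT (Controlled-NOT) gate is a two-qubit gate, where one qubit acts as the control qubit and the other qubit is the target qubit. The matrix representations of these gates in the computational basis of $|0\rangle$ and $|1\rangle$ are:
$$H = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \quad T = \begin{bmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{bmatrix}, \quad \text{CNOT} = \begin{bmatrix} I & 0 \\ 0 & X \end{bmatrix}$$
Here, $I$ is the $2 \times 2$ identity matrix. The quantum state of the target changes only by a Pauli $X$ gate when the control is in state $|1\rangle$, where
$$X = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$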
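Rather than implementing a fixed set of single-qubit gates, IBM hardware can implement an arbitrary single-qubit gate, $U_3(\theta, \phi, \lambda)$, by setting the three parameters $\theta$, $\phi$, and $\lambda$. Here, $\theta$ represents the angle of rotation about the Y-axis, and $\phi$ and $\lambda$ represent the angles of rotation around the Z-axis in the Bloch sphere [1]. The matrix representation of the $U_3$ gate is given as
$$U_3(\theta, \phi, \lambda) = \begin{bmatrix} \cos\frac{\theta}{2} & -e^{i\lambda}\sin\frac{\theta}{2} \\ e^{i\phi}\sin\frac{\theta}{2} & e^{i(\phi+\lambda)}\cos\frac{\theta}{2} \end{bmatrix} \tag{2}$$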
However, any multi-qubit gate other than CNOT must be decomposed into a combination of CNOT and single-qubit gates before it can be implemented on the actual hardware.
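The generic $U_3$ gate is often simply called the $U$ gate. Implementing a general $U_3$ gate can be prone to hardware errors; therefore, IBM allows two additional single-qubit gates, $U_1$ and $U_2$, which are special cases of $U_3$: $U_1(\lambda) = U_3(0, 0, \lambda)$ and $U_2(\phi, \lambda) = U_3(\pi/2, \phi, \lambda)$ [7]. We can now represent the single-qubit rotation of $\theta$ around the Y-axis, the $R_Y(\theta)$ gate, as
$$R_Y(\theta) = U_3(\theta, 0, 0) = \begin{bmatrix} \cos\frac{\theta}{2} & -\sin\frac{\theta}{2} \\ \sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{bmatrix} \tag{3}$$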
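The relation between the general single-qubit gate of Eq. (2) and the Y-axis rotation of Eq. (3) can be checked numerically. The sketch below assumes the standard IBM parameterization of $U_3(\theta, \phi, \lambda)$; the helper names `u3`, `ry`, and `close` are illustrative and are not Qiskit functions.

```python
import cmath
import math


def u3(theta, phi, lam):
    """2x2 matrix of U3(theta, phi, lambda), assuming the standard IBM convention."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -cmath.exp(1j * lam) * s],
            [cmath.exp(1j * phi) * s, cmath.exp(1j * (phi + lam)) * c]]


def ry(theta):
    """2x2 matrix of a rotation by theta about the Y-axis."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]


def close(m1, m2, tol=1e-12):
    """Element-wise comparison of two 2x2 matrices."""
    return all(abs(m1[i][j] - m2[i][j]) < tol for i in range(2) for j in range(2))


theta = 0.7  # arbitrary test angle
print(close(u3(theta, 0.0, 0.0), ry(theta)))  # RY(theta) is U3(theta, 0, 0)

lam = 1.1    # arbitrary test phase
u1_expected = [[1, 0], [0, cmath.exp(1j * lam)]]
print(close(u3(0.0, 0.0, lam), u1_expected))  # U1(lambda) is U3(0, 0, lambda)
```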
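A controlled-$U$ gate, $CU$, is an application of the $U$ gate to the target qubit(s) when the control qubit is $|1\rangle$. For example, the controlled-$R_Y(\theta)$ gate, $CR_Y(\theta)$, performs the Y-axis rotation of $\theta$ on the target qubit when the control qubit is in the state $|1\rangle$. It can be written as
$$CR_Y(\theta) = \begin{bmatrix} I & 0 \\ 0 & R_Y(\theta) \end{bmatrix} \tag{4}$$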
The $CU$ gate can be generalized to a gate with $n$ control qubits, denoted as $C^nU$. In general, $C^nU$ means that a unitary operation $U$ will be applied to the target qubit(s) when all $n$ control qubits are in the state $|1\rangle$. CCNOT (or CCX or Toffoli) and $C^2R_Y(\theta)$ are two examples. These are three-qubit gates, with two control qubits and one target qubit. The three-qubit gates are not elementary gates; therefore, they must be decomposed into a sequence of single-qubit and CNOT gates [14].
III Quantum Bayesian networks
In this section, we discuss the general approach to represent a Bayesian network on the IBM gate architecture using the gates discussed in Section II-B, and illustrate the approach for the two-node Bayesian network in Section II-A.
We follow the three key ideas given below when representing a Bayesian network using the gate architecture [1].

Map each node in a BN to one or more qubits (based on the number of discrete states of the random variable)

Map the marginal/conditional probabilities of each node to the probability amplitudes (or probabilities) associated with various states of the qubit(s).

Obtain the desired probability amplitudes of various quantum states through (controlled) rotation gates.
In the CQBN approach, we represent the marginal/conditional probabilities of each node using appropriate (controlled) rotation gates, and we obtain the overall circuit of the BN by composing the rotation gates of various nodes in the order of the nodes in the BN. We start with the root nodes, then represent all child nodes whose parents are the root nodes, and the procedure is continued until all the nodes are represented in the quantum circuit. In Fig. 1, we begin with the root node (A), and then represent the child node (B).
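We begin the circuit with the representation of root nodes using $R_Y$ gates. Applying the $R_Y(\theta)$ rotation gate transforms $|0\rangle$ to $\cos\frac{\theta}{2}|0\rangle + \sin\frac{\theta}{2}|1\rangle$. The probabilities associated with the $|0\rangle$ and $|1\rangle$ states are $\cos^2\frac{\theta}{2}$ and $\sin^2\frac{\theta}{2}$ respectively. In Fig. 1, A is the root node with states 0 and 1. If $P(A=0)$ and $P(A=1)$ represent the probabilities of states 0 and 1, then the rotation angle ($\theta_A$) can be calculated as
$$\theta_A = 2\tan^{-1}\sqrt{\frac{P(A=1)}{P(A=0)}} \tag{5}$$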
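As a worked example of Eq. (5), the sketch below computes the rotation angle for a root node and checks that the resulting amplitudes reproduce the target probabilities. It uses $P(\text{IR}=0) = 0.750$, the classical value reported for the root node IR in Table I; the function name `root_rotation_angle` is illustrative.

```python
import math


def root_rotation_angle(p0, p1):
    """Angle theta such that cos^2(theta/2) = p0 and sin^2(theta/2) = p1.

    atan2 is used instead of atan(sqrt(p1 / p0)) so that p0 = 0 does not
    cause a division by zero.
    """
    return 2 * math.atan2(math.sqrt(p1), math.sqrt(p0))


theta_ir = root_rotation_angle(0.75, 0.25)   # P(IR=0) = 0.750, P(IR=1) = 0.250
prob_zero = math.cos(theta_ir / 2) ** 2      # probability of measuring |0>
prob_one = math.sin(theta_ir / 2) ** 2       # probability of measuring |1>
print(f"theta = {theta_ir:.4f} rad, P(0) = {prob_zero:.3f}, P(1) = {prob_one:.3f}")
```

For these probabilities the angle is $\pi/3$, since $\tan^{-1}\sqrt{1/3} = \pi/6$.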
Since the probabilities of a child node are dependent on the values of the parent nodes, we calculate rotation angles for every combination of parent node values, and implement these angles using controlled rotation gates, where the parent nodes act as control qubits and the child node is the target qubit. In Fig. 1, B is the child node with A as the parent node. Since A can take two values, we will have two rotation angles of B ($\theta_{B|A=0}$ and $\theta_{B|A=1}$) representing its probabilities for $A=0$ and $A=1$ respectively. Fig. 2 provides the quantum circuit of the Bayesian network in Fig. 1.
In a $CR_Y$ gate, the $R_Y$ gate is applied on the target qubit when the control qubit is $|1\rangle$. Therefore, to represent the conditional probabilities of B when $A=0$, we flip the qubit relating to A using the Pauli $X$ gate (discussed in Section II-B), and flip it back after applying the controlled rotation. It should be noted that $CR_Y$ is not an elementary gate but Qiskit allows for its application; however, it will be decomposed into a combination of CNOT and single-qubit gates in the backend.
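The construction for the two-node network can be sketched with a small state-vector calculation in plain Python, without any quantum SDK. The gate order used here (rotation on A, controlled rotation for A=1, then X, controlled rotation for A=0, X) is one ordering consistent with the description above and is an assumption, as are the probability values and the helper names `angle`, `apply_ry`, and `apply_x`.

```python
import math


def angle(p0, p1):
    """Rotation angle with cos^2(theta/2) = p0 and sin^2(theta/2) = p1."""
    return 2 * math.atan2(math.sqrt(p1), math.sqrt(p0))


def apply_ry(state, theta, target, control=None):
    """Apply RY(theta) to qubit `target`; if `control` is given, act only where it is 1.

    `state` maps bit tuples (a, b) to real amplitudes (qubit 0 = A, qubit 1 = B).
    """
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    new = dict(state)
    for bits in state:
        if bits[target] != 0:
            continue  # each (target=0, target=1) pair is handled once
        if control is not None and bits[control] != 1:
            continue  # controlled gate acts only when the control is |1>
        partner = tuple(1 if i == target else bit for i, bit in enumerate(bits))
        a0, a1 = state[bits], state[partner]
        new[bits] = c * a0 - s * a1
        new[partner] = s * a0 + c * a1
    return new


def apply_x(state, target):
    """Apply a Pauli X (bit flip) to qubit `target`."""
    return {tuple(b ^ 1 if i == target else b for i, b in enumerate(bits)): amp
            for bits, amp in state.items()}


# Hypothetical probabilities (illustrative only, not the values of Fig. 1).
p_a = (0.3, 0.7)            # P(A=0), P(A=1)
p_b_a0 = (0.9, 0.1)         # P(B=0 | A=0), P(B=1 | A=0)
p_b_a1 = (0.2, 0.8)         # P(B=0 | A=1), P(B=1 | A=1)

state = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.0}  # start in |00>
state = apply_ry(state, angle(*p_a), target=0)                 # root node A
state = apply_ry(state, angle(*p_b_a1), target=1, control=0)   # B given A=1
state = apply_x(state, 0)                                      # flip A
state = apply_ry(state, angle(*p_b_a0), target=1, control=0)   # B given A=0
state = apply_x(state, 0)                                      # flip A back

probs = {bits: amp ** 2 for bits, amp in state.items()}
for (a, b), p in sorted(probs.items()):
    print(f"P(A={a}, B={b}) = {p:.3f}")
```

The squared amplitudes equal the products $P(A=a)\,P(B=b \mid A=a)$, that is, the joint distribution encoded by the Bayesian network.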
When the number of parent nodes is greater than 1, the conditional probabilities of a child node are realized using higher-order controlled rotations ($C^nR_Y$). When $n \geq 2$, Qiskit does not allow us to represent $C^nR_Y$ gates directly. In order to represent such higher-order rotations, we use additional qubits called ancilla qubits [1]. The example used in the evaluation study (Section V) uses ancilla qubits. More details on representing higher-order controlled rotations are available in [1].
IV IBM QX Hardware
The nine quantum devices that are used in this study are Burlington, Ourense, Vigo, Essex, London, Rome, Athens, Yorktown, and Melbourne. Based on the number of qubits and the architecture, these devices can be divided into four groups. The first group consists of devices with five qubits in a T-shaped architecture. Devices in this group are Burlington, Ourense, Vigo, Essex and London. The second group includes devices with five qubits arranged in a line architecture. This group consists of two devices, Rome and Athens. The remaining two devices have two distinct architectures. Yorktown is a five-qubit device with qubits in a bow-tie configuration. Melbourne is a 15-qubit device where the qubits are arranged in a box configuration. In total, we have eight five-qubit devices and one with 15 qubits. The architectures of all the devices are shown in Fig. 3. The circles represent the qubits and the arrows indicate the connectivity among the qubits.
Errors: Along with the architectures, Fig. 3 also provides color scales of two types of errors for each device: single-qubit and CNOT error rates. The colors of the qubits (circles) represent the single-qubit error while the colors of the connections represent the CNOT error. It can be observed that different devices (even with the same architecture) have different single-qubit and CNOT error rates. The single-qubit error is characterized using the randomized benchmarking method [10], where a qubit is taken on a random walk over a route on the Bloch sphere that starts from state $|0\rangle$ and is expected to return to $|0\rangle$ in the end. This walk is performed by applying a set of single-qubit gates. Increasing the number of these gates exponentially decreases the chance of returning to the initial state. This decay rate can be used to estimate the average error rate for those single-qubit gates [10]. The CNOT error rate is estimated using a similar approach but with two-qubit Clifford gates [10]. Moreover, the error rates on various devices are periodically updated. Therefore, Fig. 3 also provides the timestamp at which these error snapshots were taken. From Fig. 3, it can be noticed that the calibration timestamps for Rome and Athens are more recent than those of the other devices, as these two devices were only recently made available by IBM to the public with free access.
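The decay-based estimate described above can be sketched as follows. The sketch assumes the commonly used single-qubit randomized benchmarking model, in which the survival probability after $m$ gates is $A p^m + B$ and the average error per gate is $r = (1 - p)/2$; the asymptote $B = 0.5$, the synthetic noise-free data, and the function name `fit_rb_decay` are assumptions for illustration and do not come from the IBM calibration data.

```python
import math


def fit_rb_decay(lengths, survival, asymptote=0.5):
    """Fit survival = A * p**m + asymptote by linear regression on log(survival - asymptote).

    Returns the decay parameter p and the single-qubit error rate r = (1 - p) / 2.
    """
    ys = [math.log(s - asymptote) for s in survival]
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, ys))
             / sum((x - mean_x) ** 2 for x in lengths))
    decay = math.exp(slope)          # slope of the log-linear fit is log(p)
    return decay, (1 - decay) / 2


# Synthetic, noise-free survival probabilities generated with p = 0.99.
lengths = [1, 10, 50, 100, 200]
true_decay = 0.99
survival = [0.5 * true_decay ** m + 0.5 for m in lengths]

decay, error_rate = fit_rb_decay(lengths, survival)
print(f"fitted decay p = {decay:.4f}, estimated error rate r = {error_rate:.4f}")
```

With measured data, the survival probabilities would be averages over many random gate sequences, and a nonlinear fit of all three parameters would normally replace the fixed asymptote used here.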
Transpiler: As discussed in Section I, the transpiler is used to transpile a given circuit into a circuit that can be executed on a given quantum device after considering the single-qubit and CNOT error rates. Another input to the transpiler for transpiling a given quantum circuit is the optimization level. The optimization level determines the amount of optimization that needs to be performed in obtaining a transpiled circuit. There are four levels of optimization: Level 0 (no optimization), Level 1 (light optimization), Level 2 (medium optimization), and Level 3 (heavy optimization). As the optimization level increases, the transpilation time to obtain the optimal implementation of that circuit also increases [14].
At Level 0, no explicit optimization is performed other than mapping a given circuit to the desired backend device. This is useful for characterization experiments such as randomized benchmarking [10] or error amplification, where we do not want the transpiler to apply any optimization [14]. At Level 1, the transpiler performs light optimization by collapsing adjacent gates; this combines a chain of single-qubit gates (such as rotations) into one gate when feasible. Level 2 optimization includes noise-adaptive qubit mapping and gate cancellation by considering commutativity rules for the gates; finally, Level 3 optimization includes noise-adaptive qubit mapping and gate cancellation using commutativity rules, along with unitary synthesis. In Qiskit, the default level of optimization is Level 1 [14].
V Evaluation Study
In our evaluation study, we simulated the 4-node Bayesian network given in Fig. 4 on all the nine quantum devices discussed in Section IV along with the IBM Qiskit simulator, and compared the results against classical analysis (performed using Netica software [12]). The Bayesian network is obtained from [15], and is used for stock price prediction of an oil company. The variables IR (Interest Rate) and OI (Oil Industry) are the root nodes, and SM (Stock Market) and SP (Stock Price) are the child nodes; these are discrete variables with two values, 0 and 1. In this example, we have two root nodes (IR and OI) and two child nodes: SM, which has one parent node, and SP, which has two parent nodes (SM and OI).
Quantum Circuit: First, we construct the quantum circuit using the Qiskit package following the procedure discussed in Section III. This circuit is then run on all the nine hardware devices and the simulator. We considered 8192 shots in each run, as it is the highest number of shots allowed on the IBM devices. The results from each run are used to compute the marginal probabilities of all the nodes. Since each variable takes two values, it has two marginal probability values that sum to unity. Therefore, we report only the probability of each variable being equal to 0; the probability of it being equal to 1 follows by subtracting from unity. Since the measurements obtained from quantum circuits are probabilistic in nature (probabilities based on probability amplitudes of qubits), we performed 10 runs on each device (each run with 8192 shots) and obtained the mean and standard deviation values of the marginal probabilities across the 10 runs. Moreover, we considered all the four optimization levels for each hardware device. This comparison lets us investigate the effect of hardware noise on the accuracy of the results at different optimization levels.
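The post-processing described above can be sketched as follows: measurement counts from each run are reduced to the marginal probability of each variable being 0, and these are then summarized across runs. The counts below are made up for a two-variable circuit, the bit ordering (character i of each bitstring belongs to variable i) is an assumption that must be matched to the backend's convention, and the use of the sample standard deviation is also an assumption, since the paper does not state which estimator was used.

```python
from statistics import mean, stdev


def marginal_zero_probs(counts, n_vars):
    """Marginal P(variable i = 0) for each variable from a counts dictionary.

    `counts` maps measured bitstrings to the number of shots that produced them.
    """
    shots = sum(counts.values())
    return [sum(c for bits, c in counts.items() if bits[i] == "0") / shots
            for i in range(n_vars)]


# Hypothetical counts from two runs of 8192 shots each (illustrative only).
runs = [
    {"00": 4000, "01": 2000, "10": 1192, "11": 1000},
    {"00": 4100, "01": 1950, "10": 1100, "11": 1042},
]

per_run = [marginal_zero_probs(counts, n_vars=2) for counts in runs]

# Mean and sample standard deviation of each marginal across the runs.
summary = [(mean(values), stdev(values)) for values in zip(*per_run)]
for i, (m, s) in enumerate(summary):
    print(f"variable {i}: P(=0) mean = {m:.3f}, std = {s:.3f}")
```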
To ensure that the experimental conditions remain constant throughout the runs, all the experiments for each computer were performed simultaneously at the same calibration for all the runs [5]. In this way, the variation of noise over time would have the minimal effect on the results.
Fig. 5 provides the circuit corresponding to the Bayesian network in Fig. 4. Since SP has two parent nodes, we need to implement $C^2R_Y$ gates to realize its conditional probabilities. As discussed in Section III, the implementation of $C^2R_Y$ requires the use of an ancilla qubit. Therefore, the quantum circuit in Fig. 5 has five qubits, along with classical bits to store the measurements. In Fig. 5, one qubit serves as the ancilla and the remaining four qubits correspond to the variables IR, OI, SM, and SP respectively. The $C^2R_Y$ gate is decomposed into a combination of $CR_Y$ and CCNOT gates using the ancilla qubit.
Results: Table I provides the mean and standard deviation (in parentheses) of the marginal probabilities obtained on various IBM hardware at different optimization levels, compared with the results from the Qiskit simulator and classical analysis. The calibration timestamp and the errors of single-qubit and CNOT gates at the time of the runs are the same as given in Fig. 3. Since the simulator is not affected by noise, its results are the same across all the optimization levels.
Device  P(IR=0)  P(OI=0)  P(SM=0)  P(SP=0)  
Netica  0.750  0.600  0.425  0.499 
Simulator  0.750 (0.006)  0.601 (0.003)  0.425 (0.005)  0.499 (0.006) 
Optimization Level 0  
Burlington  0.240 (0.015)  0.536 (0.016)  0.476 (0.022)  0.598 (0.022) 
Vigo  0.449 (0.021)  0.544 (0.012)  0.464 (0.026)  0.567 (0.018) 
Ourense  0.310 (0.028)  0.529 (0.015)  0.502 (0.019)  0.576 (0.017) 
London  0.283 (0.021)  0.558 (0.052)  0.504 (0.050)  0.618 (0.064) 
Essex  0.330 (0.034)  0.512 (0.019)  0.493 (0.022)  0.562 (0.029) 
Yorktown  0.718 (0.024)  0.567 (0.054)  0.454 (0.054)  0.497 (0.025) 
Rome  0.393 (0.052)  0.499 (0.007)  0.463 (0.007)  0.674 (0.017) 
Athens  0.410 (0.013)  0.550 (0.004)  0.470 (0.011)  0.601 (0.010) 
Melbourne  0.440 (0.068)  0.582 (0.024)  0.506 (0.030)  0.721 (0.056) 
Optimization Level 1  
Burlington  0.249 (0.069)  0.523 (0.013)  0.472 (0.027)  0.586 (0.024) 
Vigo  0.440 (0.035)  0.547 (0.010)  0.474 (0.013)  0.558 (0.013) 
Ourense  0.356 (0.032)  0.515 (0.018)  0.499 (0.028)  0.566 (0.020) 
London  0.370 (0.021)  0.540 (0.017)  0.491 (0.019)  0.579 (0.021) 
Essex  0.332 (0.020)  0.532 (0.023)  0.492 (0.018)  0.578 (0.022) 
Yorktown  0.622 (0.158)  0.579 (0.042)  0.454 (0.028)  0.502 (0.026) 
Rome  0.386 (0.061)  0.516 (0.009)  0.464 (0.007)  0.669 (0.020) 
Athens  0.424 (0.009)  0.548 (0.008)  0.463 (0.007)  0.587 (0.005) 
Melbourne  0.449 (0.065)  0.569 (0.025)  0.499 (0.070)  0.709 (0.043) 
Optimization Level 2  
Burlington  0.362 (0.017)  0.526 (0.022)  0.461 (0.029)  0.659 (0.041) 
Vigo  0.479 (0.040)  0.532 (0.022)  0.476 (0.019)  0.582 (0.018) 
Ourense  0.395 (0.041)  0.536 (0.023)  0.491 (0.026)  0.578 (0.013) 
London  0.426 (0.037)  0.524 (0.012)  0.509 (0.021)  0.606 (0.015) 
Essex  0.842 (0.015)  0.551 (0.015)  0.446 (0.020)  0.597 (0.023) 
Yorktown  0.652 (0.159)  0.610 (0.015)  0.462 (0.005)  0.474 (0.014) 
Rome  0.840 (0.017)  0.541 (0.005)  0.522 (0.007)  0.702 (0.014) 
Athens  0.832 (0.012)  0.511 (0.006)  0.510 (0.006)  0.630 (0.009) 
Melbourne  0.710 (0.185)  0.574 (0.038)  0.482 (0.042)  0.674 (0.049) 
Optimization Level 3  
Burlington  0.526 (0.055)  0.530 (0.018)  0.476 (0.031)  0.620 (0.040) 
Vigo  0.525 (0.019)  0.527 (0.015)  0.517 (0.020)  0.561 (0.033) 
Ourense  0.497 (0.010)  0.536 (0.011)  0.494 (0.020)  0.567 (0.031) 
London  0.552 (0.008)  0.542 (0.009)  0.506 (0.019)  0.601 (0.028) 
Essex  0.776 (0.060)  0.577 (0.026)  0.483 (0.037)  0.555 (0.023) 
Yorktown  0.626 (0.130)  0.600 (0.011)  0.459 (0.014)  0.465 (0.026) 
Rome  0.869 (0.021)  0.553 (0.006)  0.503 (0.008)  0.706 (0.019) 
Athens  0.846 (0.020)  0.526 (0.008)  0.471 (0.016)  0.562 (0.029) 
Melbourne  0.732 (0.102)  0.586 (0.025)  0.474 (0.058)  0.629 (0.044) 
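Performance comparison: To compare the results in Table I, we calculated the root mean square percentage error (RMSPE) using the expression in Eq. 6. The RMSPE values are given in Table II.
$$\epsilon = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\frac{x_i - \hat{x}_i}{x_i}\right)^2} \times 100\% \tag{6}$$
Here, $\epsilon$ is the RMSPE, $n$ is the number of marginal probabilities compared, and $x_i$ and $\hat{x}_i$ are the true and expectation values (over 10 runs). The true values are obtained from classical analysis, using Netica software.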
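As a check on the error metric, the sketch below evaluates the RMSPE for two rows of Table I (Yorktown and Burlington at Level 0) against the Netica values. The function name `rmspe` is illustrative; the inputs are the rounded means reported in Table I, so the results agree with Table II only up to rounding.

```python
import math


def rmspe(true_values, estimates):
    """Root mean square percentage error between true values and estimates."""
    n = len(true_values)
    squared = sum(((t - e) / t) ** 2 for t, e in zip(true_values, estimates))
    return 100 * math.sqrt(squared / n)


# P(IR=0), P(OI=0), P(SM=0), P(SP=0) as reported in Table I.
netica = [0.750, 0.600, 0.425, 0.499]
yorktown_level0 = [0.718, 0.567, 0.454, 0.497]
burlington_level0 = [0.240, 0.536, 0.476, 0.598]

print(f"Yorktown, Level 0:   {rmspe(netica, yorktown_level0):.1f}%")
print(f"Burlington, Level 0: {rmspe(netica, burlington_level0):.1f}%")
```

These evaluate to approximately 4.9% and 36.3%, matching the corresponding entries in Table II. Because each term is normalized by the true value, a large deviation in a single marginal, such as P(IR=0) for Burlington, dominates the metric.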
Device  Level 0  Level 1  Level 2  Level 3  
Simulator  0.1%  0.1%  0.1%  0.1% 
Burlington  36.3%  35.6%  31.3%  21% 
Vigo  11.1%  11.4%  21.5%  20.5% 
Ourense  32.2%  29.3%  26.7%  20.6% 
London  34.8%  28.2%  26.8%  19.8% 
Essex  30.7%  30.6%  12.5%  9.1% 
Yorktown  4.9%  9.4%  8.3%  9.8% 
Rome  31.1%  31.8%  24.6%  24.4% 
Athens  25.8%  24.3%  18.9%  12.2% 
Melbourne  31.9%  30.4%  19.1%  14.4% 
We make the following observations from Table II.

The results from the simulator are almost the same as the results from classical analysis.

For most devices (Burlington, Ourense, London, Essex, Athens, and Melbourne), the percentage error decreases steadily as the optimization level increases. Rome follows the same overall trend, apart from a slight increase from Level 0 to Level 1.

The error rate for Yorktown is the lowest among all the hardware at Levels 0, 1, and 2, and the second lowest at Level 3 (where it is higher than that of Essex by only 0.7 percentage points).

The best result across different optimization levels and various hardware is by Yorktown at Level 0 optimization.

Of all the devices, Essex shows the largest decrease in error rate across the optimization levels (from 30.7% at Level 0 to 9.1% at Level 3).
Since most devices have the best performance at Level 3 optimization, we provide box plots in Fig. 6 for all the marginal probabilities at this level to study the variation in results across the 10 runs. In Fig. 6, the devices are shown on the X-axis while probabilities are plotted on the Y-axis. The red line in each plot represents the true probability obtained from classical analysis. For a fair comparison, we used the same scale on the Y-axis in all the plots. In Fig. 6, Yorktown has the largest range among all devices; this is particularly evident in the first plot, corresponding to P(IR=0).
VI Conclusion
This paper discussed an experimental evaluation of the performance of nine IBM QX hardware devices (Burlington, Vigo, Ourense, London, Essex, Yorktown, Rome, Athens, and Melbourne) in simulating a Quantum Bayesian network. First, we developed a quantum circuit to represent the Bayesian network using Qiskit, which is a Python package for simulating quantum computation. The circuit was then transpiled to run on various hardware, and the hardware results were compared against those from the Qiskit simulator and classical analysis. We also considered all the four levels of optimization (no, light, medium, and heavy optimization) when obtaining a transpiled circuit. On each device, we performed 10 runs, each run with 8192 shots. We used the root mean square percentage error as a metric to compare the performance of various devices.
We observed that the performance of most devices (6 out of 9) improved with the optimization level. We observed the best performance for Yorktown at all the optimization levels except Level 3, where Essex was marginally better. From the results, we conclude that the existing hardware is not very effective in simulating Quantum Bayesian networks due to hardware noise. The errors are significant even in a small Bayesian network (with 4 nodes), and we can expect the error rates to increase in more complex Bayesian networks. One way to reduce the error rate could be to develop fault-tolerant circuits and perform fault-tolerant computation.
As our future work, we will consider designing fault-tolerant circuits to manage the hardware noise and improve the simulation performance of Quantum Bayesian networks. We will also investigate techniques to develop circuits with lower depths, as fewer gates can lead to less noisy measurements overall from the designed circuits.
References
[1] (2020) Quantum circuit representation of Bayesian networks. arXiv:2004.14803.
[2] (2002) Quantum amplitude amplification and estimation. Contemporary Mathematics 305, pp. 53–74.
[3] (2019) Validating quantum computers using randomized model circuits. Physical Review A 100 (3), pp. 032328.
[4] (2016) Quantum gates and architecture for the quantum simulation of the Fermi-Hubbard model. Physical Review A 94 (6), pp. 062304.
[5] (2016) Quantum fault tolerance in small experiments. arXiv preprint arXiv:1610.03507.
[6] (2019) Fault-tolerant logical gates in the IBM Quantum Experience. Physical Review Letters 122 (8), pp. 080504.
[7] IBM Q Experience. Note: https://quantumcomputing.ibm.com
[8] (2018) Quantum machine learning for data scientists. arXiv preprint arXiv:1804.10068.
[9] (2014) Quantum inference on Bayesian networks. Physical Review A 89 (6), pp. 062315.
[10] (2012) Characterizing quantum gates via randomized benchmarking. Physical Review A 85 (4), pp. 042311.
[11] (2018) Qiskit backend specifications for OpenQASM and OpenPulse experiments. arXiv preprint arXiv:1809.03452.
[12] (2019) Norsys Software Corporation, Netica version 6.05. Note: https://www.norsys.com/
[13] (2002) Quantum computation and quantum information. American Association of Physics Teachers.
[14] (2019) Qiskit: an open-source framework for quantum computing.
[15] (2000) Bayesian network models of portfolio risk and return.
[16] (2019) Quantum risk analysis. npj Quantum Information 5 (1), pp. 15.
[17] (2018) An efficient quantum circuits optimizing scheme compared with Qiskit (short paper). In International Conference on Collaborative Computing: Networking, Applications and Worksharing, pp. 467–476.