Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causal pathogen of coronavirus disease 2019 (COVID-19), had infected over 365 million people and caused over 5.65 million deaths worldwide as of January 28, 2022[5]. Although numerous vaccines have proven effective in reducing the severity and infectivity of the virus, new SARS-CoV-2 variants emerge rapidly and put great pressure on treatment. B.1.617.2 (Delta, first identified in India) has been reported as the globally dominant variant of concern; it shows stronger transmissibility and carries a higher risk of hospitalization, intensive care and death[6]. Moreover, B.1.617.2 weakens the ability of antibodies to neutralize the virus and increases the likelihood of reinfection in convalescent patients. In late 2021, the emergence of B.1.1.529 (Omicron, first reported in South Africa) dramatically accelerated the spread of the virus and was shown to significantly reduce the efficacy of vaccines. Other variants of concern, including B.1.1.7 (Alpha, first identified in the United Kingdom), B.1.351 (Beta, South Africa) and P.1 (Gamma, Brazil), have also disrupted global pandemic control. Therefore, there is an urgent need to predict the variation structure of COVID-19 epidemic strains such as Delta and Omicron in order to enable early prevention and rapid treatment, thus contributing to global pandemic control[1, 2, 3, 4].
Quantum computation[7] has attracted considerable interest in both academia and industry and has been widely applied to machine learning. Since near-term quantum devices are still fairly noisy, researchers mainly focus on hybrid quantum-classical machine learning algorithms[16]. As accelerators in a heterogeneous computing architecture, photonic and quantum processors can directly improve the performance of artificial neural networks and play the role of domain-specific processors. To bring the advantages of quantum processors to a wide range of deep learning applications, a high-level library and supporting compiler software are necessary. Our self-developed DeepQuantum aims to make quantum applications easy to develop and to draw on the help of the artificial intelligence community. On that foundation, hybrid quantum-classical models including variational quantum eigensolvers, quantum convolutional neural networks[8], quantum generative adversarial networks (GANs)[9, 32] and quantum reinforcement learning[13] have been developed. In addition, we propose a quantum style-based generative adversarial network to better predict COVID-19 epidemic strains.
Style-based generative adversarial networks[37, 10, 12] have been successfully applied to generate high-resolution realistic images[40] in the style of the training dataset and enable intuitive, scale-specific control of the synthesis. We introduce a hybrid quantum-classical model that combines this scale-specific control with the inherent acceleration of quantum computing[36] to optimize the training process of the classical GAN. With this capability of feature control, style mixing is performed by switching from one latent code to another at a selected point in the synthesis network, thus generating a feature map with mixed features.
To preserve the structure of the quantum discriminator in our quantum style-based GAN as far as possible, we also introduce an operation called quantum-inspired blur convolution, which is analogous to depthwise convolution in classical machine learning, and we realize quantum progressive training[11]. We then turn the discriminator of the style-based GAN into its quantum counterpart and adapt the original generator to SARS-CoV-2 RNAs. We compare the classical and quantum style-based GANs by recording the training process with multiple loss functions[14, 15, 31]; the results show that training of the quantum model is more stable, converges better and is flexible with respect to the loss function. Meanwhile, we use fidelity to evaluate the performance of predicting epidemic strains, and the fidelities of the generated samples are consistently over 96% for Delta and 94% for Omicron. Quantum style-based GAN, as an example, predicts the mutation probability of coronavirus strains by mining past SARS-CoV-2 mutation RNA sequences, provides an effective way to apply hybrid quantum-classical algorithms, and shows the advantages of combining quantum algorithms with deep learning.
Results and Discussion
Datasets of mutated SARS-CoV-2 RNAs. To collect diverse and essential SARS-CoV-2 mutation characteristics for quantum style-based GAN training, we include 774 mutated SARS-CoV-2 complete genomes from 9 variants of concern: B.1.1.7, B.1.351, B.1.429, B.1.525, B.1.526, B.1.617.1, P.2, B.1.617.2 and B.1.1.529. The data come from the publicly available databases GISAID[18] and NCBI[17]. We construct the spike protein cohort by aligning the complete genomes, intercepting the spike protein fragment and squeezing out duplicate sequences (described in “Methods”). In the hybrid quantum-classical model, we use 734 samples from the first 8 variants as the training cohort for generating and predicting Delta mutational sites, and add 40 Omicron samples to this cohort to form the training cohort for Omicron mutational sites. The mutation frequency plot is split into four parts according to position in the spike protein.
Quantum encoding. To extract the variation features of mutated SARS-CoV-2 RNAs and match the input of the hybrid quantum-classical model, we apply quantum encoding[39] to the spike protein cohort. Each spike-region gene sequence in the cohort is compared, position by position, with the first identified SARS-CoV-2 sequence, NC_045512.2. If the two RNA base sequences differ at a given locus, we mark 1 at the same position in a new vector; otherwise we mark 0. We thus obtain a series of 0/1 vectors of size 1024 (2^10, since we start training from 10 qubits).
A density matrix is a Hermitian operator that must have unit trace and be positive semidefinite. To meet these conditions, we first apply L2 normalization to the vector, then compute its conjugate transpose, and finally multiply the two (an outer product) to build a 10-qubit density matrix. Each spike-region gene sequence we generate is encoded in a similar way (described in “Methods”).
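As a minimal sketch of the encoding just described, the Python snippet below builds the 0/1 mutation vector and the corresponding density matrix. It assumes NumPy and uses a toy 4-locus sequence (2 qubits) in place of the 1024-locus vectors; the function names `mutation_vector` and `encode_density_matrix` are illustrative and do not come from DeepQuantum.

```python
import numpy as np


def mutation_vector(sequence: str, reference: str) -> np.ndarray:
    """Return a 0/1 vector marking loci where `sequence` differs from `reference`."""
    if len(sequence) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    return np.array([float(a != b) for a, b in zip(sequence, reference)])


def encode_density_matrix(vector: np.ndarray) -> np.ndarray:
    """L2-normalise `vector` and return the pure-state density matrix |v><v|."""
    norm = np.linalg.norm(vector)
    if norm == 0:
        raise ValueError("an all-zero vector cannot be L2-normalised")
    state = vector / norm
    # Outer product of the state with its conjugate transpose.
    return np.outer(state, state.conj())


# Toy example: 4 loci (2 qubits) instead of 1024 loci (10 qubits).
reference = "ACGT"
sample = "ACTT"                      # differs from the reference at the third locus
v = mutation_vector(sample, reference)
rho = encode_density_matrix(v)

print(v)                             # mutation indicator vector
print(np.trace(rho))                 # unit trace
```

Because the matrix is an outer product of a normalized vector with itself, it is Hermitian, positive semidefinite and has unit trace by construction. A sequence identical to the reference would give an all-zero vector, which cannot be normalized; the sketch raises an error in that case, and how the original pipeline treats it is not stated.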
The algorithm flow of quantum style-based GAN. We propose a hybrid quantum-classical model, QuStyleGAN (quantum style-based GAN)[34, 38], for COVID-19 epidemic strain prediction. It takes the spike protein cohort directly as the training dataset, generates a variation structure that carries SARS-CoV-2 mutation characteristics, and then maps it onto a spike-region gene sequence from either the Delta or the Omicron strain (details in “Methods”). The model comprises two key parts: the classical style-based generator and the quantum progressive discriminator.
The classical style-based generator (Supplementary Fig. S1d) contains a mapping network of 8 fully connected layers and a synthesis network of 5 blocks of increasing resolution, with two convolution layers per block. A detailed description is given in Supplementary style-based GAN. The quantum progressive discriminator is introduced later.
The workflow of quantum style-based GAN is shown in Fig. 1. We first apply quantum encoding to the spike protein cohort to obtain true RNA mutation sequences. At the same time, the classical style-based generator produces fake RNA mutation sequences. Both kinds of data are then fed into the quantum progressive discriminator, which scores each sample as true or false. From these scores we compute the loss function and update the parameters of both the classical generator and the quantum discriminator. This completes one training step.
Quantum progressive discriminator and training. The quantum progressive discriminator (Fig. 2a, Supplementary Fig. S1 and QCNN) contains 5 quantum circuit modules ranging from 10 qubits down to 2 qubits in steps of 2 qubits[33, 35]. For each of the first four modules, we construct two quantum convolution layers, two quantum pooling layers and one quantum-inspired blur convolution layer (described in the next section) on the corresponding number of qubits. Each quantum layer is built from quantum gates and corresponds to a unitary matrix, and these unitary matrices are applied in the order displayed in Fig. 2a. The unitary operators are followed by a partial trace, represented by the 2-qubit observation symbols. The partial trace discards the information of 2 qubits in the density matrix while keeping the rest, which reduces the density matrix in steps of 2 qubits. The last module of the quantum progressive discriminator contains a quantum convolution layer, a quantum pooling layer, a quantum dense layer and a 1-qubit observation. The output of this module is the final discriminative result.
Quantum style-based GAN training adopts the same progressive growing methodology as the classical style-based GAN algorithm (Supplementary Classical progressive training). By progressively introducing preconfigured quantum layers, we start training with the 2-qubit block and then add progressively larger blocks to the networks, as visualized in Fig. 2b. To fade new quantum layers in smoothly, we follow the residual block operation and downscale the density matrices to match the size of the current block. Whereas classical progressive training downscales by average pooling, quantum progressive training downscales density matrices with the partial trace. The benefits of quantum progressive training include simplifying the original generation problem of mapping latent codes to spike protein variation structures, stabilizing the training process and reducing training time, as suggested for ProGAN[11].
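The downscaling step can be illustrated with a small NumPy sketch of the partial trace. It assumes that the two discarded qubits are the last two in the tensor-product ordering, which the text does not specify; `partial_trace_last` is a hypothetical helper name, not a DeepQuantum function.

```python
import numpy as np


def partial_trace_last(rho: np.ndarray, n_qubits: int, n_traced: int = 2) -> np.ndarray:
    """Trace out the last `n_traced` qubits of an `n_qubits` density matrix."""
    keep = 2 ** (n_qubits - n_traced)   # dimension of the qubits that are kept
    drop = 2 ** n_traced                # dimension of the qubits that are discarded
    if rho.shape != (keep * drop, keep * drop):
        raise ValueError("rho does not match n_qubits")
    # Index layout after reshape: (kept row, dropped row, kept col, dropped col).
    reshaped = rho.reshape(keep, drop, keep, drop)
    # Sum over the diagonal of the dropped subsystem.
    return np.einsum("ajbj->ab", reshaped)


# Example: reduce a random 4-qubit pure state to a 2-qubit density matrix.
rng = np.random.default_rng(0)
psi = rng.normal(size=16) + 1j * rng.normal(size=16)
psi /= np.linalg.norm(psi)
rho4 = np.outer(psi, psi.conj())
rho2 = partial_trace_last(rho4, n_qubits=4)

print(rho2.shape)            # the reduced matrix acts on 2 qubits
print(np.trace(rho2).real)   # the trace is preserved
```

Unlike average pooling, the partial trace keeps the result a valid density matrix (Hermitian, positive semidefinite, unit trace), so the reduced state can be passed directly to the next, smaller quantum module.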
We compare the quantum and classical style-based GANs under progressive training, recording two kinds of loss functions (details in “Methods” and Fig. S2) every 74 steps, to look for advantages of quantum neural networks. Overall, Fig. 2c and Fig. 2d illustrate that the training loss of quantum style-based GAN (QuStyleGAN) is more stable and converges better than that of classical style-based GAN (StyleGAN). For the logistic loss, the classical generator and discriminator losses oscillate alternately and never converge during the whole 3500 steps. In contrast, the quantum generator and discriminator losses oscillate strongly at first but converge after 1000 steps, with only slight oscillation afterwards. The relativistic-hinge loss behaves similarly, and QuStyleGAN converges even more clearly after 1500 steps. These findings support the power of quantum neural networks and suggest that the hybrid quantum-classical model could potentially yield a quantum advantage.
Quantum-inspired blur convolution. When designing the quantum discriminator, we note that its classical counterpart contains a blur layer that benefits noise reduction and feature extraction. On closer inspection, the blur layer is a group convolution in which the number of groups equals the number of input channels and output channels, also called depthwise convolution (Supplementary Group convolution and depthwise convolution). Depthwise convolution greatly reduces the number of parameters, makes effective use of hardware resources, and finally concatenates all feature maps. Based on these features and on quantum properties, we propose a new quantum layer called quantum-inspired blur convolution.
As shown in Fig. 3b, a quantum-inspired blur convolution consists of three key steps: partition, quantum blur convolution and concatenation. For simplicity, we take a feature map of size 4 qubits as an example (other even numbers of qubits are handled in the same way). In the partition step, we first divide the feature map into several smaller feature maps and then quantum encode them into 2-qubit density matrices. In the quantum blur convolution step, we first construct a 2-qubit parameterized quantum circuit as the quantum blur convolution kernel and then use the kernel to evolve the density matrices. In the concatenation step, we assemble the evolved density matrices into a feature map and then quantum encode this feature map into a 4-qubit density matrix. This completes one quantum-inspired blur convolution.
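The three steps can be sketched in NumPy as follows. Several details are assumptions made only for illustration, since the text does not fix them: the 4-qubit feature map is taken to be a 16×16 matrix split into non-overlapping 4×4 patches, a matrix A is encoded as A A† / tr(A A†), and the 5-parameter kernel is a hypothetical circuit of RY rotations, one CNOT and one RZ rotation. None of the function names below come from DeepQuantum.

```python
import numpy as np


def encode_matrix(a: np.ndarray) -> np.ndarray:
    """Assumed encoding: map a matrix A to the density matrix A A^dagger / tr(A A^dagger)."""
    gram = a @ a.conj().T
    trace = np.trace(gram).real
    if trace == 0:
        raise ValueError("cannot encode an all-zero matrix")
    return gram / trace


def ry(theta: float) -> np.ndarray:
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)


def rz(theta: float) -> np.ndarray:
    return np.diag([np.exp(-0.5j * theta), np.exp(0.5j * theta)])


CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)


def blur_kernel(params) -> np.ndarray:
    """Hypothetical 2-qubit parameterized circuit with 5 angles, as a 4x4 unitary."""
    t0, t1, t2, t3, t4 = params
    layer1 = np.kron(ry(t0), ry(t1))
    layer2 = np.kron(ry(t2), ry(t3))
    final = np.kron(rz(t4), np.eye(2))
    return final @ layer2 @ CNOT @ layer1


def blur_convolution(feature_map: np.ndarray, params) -> np.ndarray:
    """Partition, evolve each 2-qubit patch with the shared kernel, concatenate, re-encode."""
    size = feature_map.shape[0]
    u = blur_kernel(params)
    out = np.zeros((size, size), dtype=complex)
    for i in range(0, size, 4):
        for j in range(0, size, 4):
            rho = encode_matrix(feature_map[i:i + 4, j:j + 4])   # partition + encode
            out[i:i + 4, j:j + 4] = u @ rho @ u.conj().T          # evolve with the kernel
    return encode_matrix(out)                                     # concatenate + encode


rng = np.random.default_rng(1)
feature_map = rng.random((16, 16))
params = rng.uniform(0, 2 * np.pi, size=5)
kernel = blur_kernel(params)
rho_out = blur_convolution(feature_map, params)
print(rho_out.shape, np.trace(rho_out).real)
```

The point of the sketch is the parameter sharing: one small 2-qubit kernel is reused for every patch, in the same way that a depthwise convolution reuses one small kernel per channel, so the number of trainable angles does not grow with the size of the feature map.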
Since this quantum-inspired algorithm is an analog of depthwise convolution, a brief comparison between them is possible. In terms of the overall process, as shown in the right part of Fig. 3a, the input tensor is divided into feature maps, each feature map is convolved with a single kernel, and finally all output feature maps are concatenated into the output tensor. As stated above, the quantum-inspired algorithm has three steps corresponding to those of depthwise convolution. In terms of reducing parameters and saving resources, the left part of Fig. 3a shows that the parameter count of group convolution decreases as the number of groups grows, and depthwise convolution, the limiting case, needs the fewest parameters. Similarly, in the case displayed in Fig. 3b, the quantum-inspired module can be run on a 2-qubit parameterized quantum circuit with 5 parameters, while a normal quantum convolution has to be run on a 4-qubit parameterized quantum circuit with more parameters. Therefore, since the computing resources consumed grow exponentially as quantum circuits are enlarged, quantum-inspired blur convolution greatly saves computing resources and storage space.
Quantum model evaluation. The above quantum structure innovations are applied to the components of our quantum style-based GAN. First, we solve the quantum encoding of gene sequences. Whether the data are a vector or a matrix, we provide a method to encode them into a density matrix of the corresponding size, which may inspire further research. Second, quantum progressive training is carried out in our model. By training the quantum block-wise network progressively, we save considerable training time and obtain a stable training process. This also suggests a possible way to train a quantum network with a relatively large number of qubits. Third, we design a new quantum layer called quantum-inspired blur convolution. This layer reduces parameters, saves computing resources and serves as the depthwise convolution of quantum machine learning, and it is expected to provide noise reduction and feature extraction. The construction not only corresponds to the classical style-based GAN but can also be used like a normal quantum layer in later research.
Quantitatively, we use fidelity to evaluate our self-developed quantum GAN for variation structure prediction on the spike protein cohort. We randomly choose ten generated variation structures and ten gene sequences from the spike protein cohort to compute a fidelity heatmap. Although the quantum discriminator has far fewer parameters than the classical discriminator, the heatmap indicates that all fidelities of the selected generated sequences are over 96% for Delta (Fig. 3d) and over 94% for Omicron (Fig. 3e).
As shown in Fig. 3d, the darkest blue line corresponds to the fidelities between the ten randomly generated Delta spike protein sequences and OU007056.1 from strain P.2, which means that these fidelities are the highest in the heatmap and that the generated sequences are most similar to OU007056.1. The lightest blue regions show the lowest fidelities in the heatmap, which are concentrated between the fourth, eighth and ninth generated sequences and OD934760.1 from B.1.1.7 as well as OD979867.1 from B.1.525. However, a high fidelity could result from a sequence having few mutations relative to the original SARS-CoV-2, and the opposite situation may contribute to a low fidelity. Conclusions can also be drawn along horizontal lines; for instance, in the second line the sixth generated sequence is more similar to MW638371.1 from B.1.429 than the others are.
In the same way, Fig. 3e suggests that the highest fidelity appears at the intersection of the second generated Omicron spike protein (SP) sequence and MW708379.1 from B.1.526, while the lowest fidelity lies in the fourth line with OD934709.1 from B.1.1.7.
Performance evaluation of generated spike protein variation structure. Apart from the fidelity between the spike protein sequences generated by our self-developed model and the spike protein sequences of the selected variants, a total of 1000 output sequences generated by the model are used to show the positions of their mutations in both the whole sequence and the spike protein sequence[19]. As Fig. 1a shows, the Delta and Omicron variants share some single mutational sites but each still follows its own distinctive pattern of mutation. Single nucleotide substitutions detected in the existing Delta and Omicron samples in the spike protein sequence region accumulate around positions 300-500 as the first peak and around position 2000 as the second peak. In Fig. 4, the positions with high frequencies in the generated sequences form several peaks in the ranges 300-500 and 3300-3500. The first peak follows a pattern relatively similar to that of the existing samples, but the newly generated sequences reveal a minor deviation from the position of the second peak.
In the mutational frequency result for the existing Delta variant sequences, the peak at positions 300-500 of the nucleotide sequence lies in the part of S1 upstream of the RBD region[20, 21], and the peak around position 2000 lies in the RBD region of the spike protein[1, 22]. Within the RBD, the spike protein mutation at amino acid position 501 has the potential to increase ACE2-binding affinity by prolonging the “open conformation” of the ACE2 binding receptor. The mutation at amino acid position 519 is believed to decrease neutralization by convalescent serum, thereby affecting vaccine efficacy and antibody detection in the body. The mutation at amino acid position 614 increases the viral load and produces more infectious viral particles in humans in vivo[23, 24]. Each of these mutations has been shown by quantitative analysis to increase infectivity or evasion of the human immune system[25, 26, 27].
In the mutational frequency result for the Omicron input samples, Omicron contains more amino acid deletions and mutational sites on the spike protein than Delta[42, 43]. Omicron shares the mutational sites at positions 478 and 501, which gives this variant the ability to increase ACE2 binding affinity. The additional mutation at position 484 has the potential to cause immune escape, and the deletion at positions 69 and 70 is responsible for S-gene target failure (SGTF)[43].
To evaluate the model’s effectiveness from a biological perspective, we use the generated sequences to develop an algorithm in R that integrates the data into a visual presentation, using bar plots, of the mutational frequency at each potential site. Based on the significant mutation positions in the whole sequences and the spike protein sequences, the differences and similarities between the generated and existing variants are deduced in order to assess the plausibility of the predicted variants and the viability of the model[28, 36]. In the resulting frequency maps, positions 300-500 (Fig. 4a and 4e) and 3300-3500 (Fig. 4d and 4h) are the most likely ranges for a single nucleotide variation.
For the generated Delta sequences in Fig. 4a, the most favorable mutation sites occur in the range 300-500. This range of the nucleotide sequence lies in the part of S1 upstream of the RBD region[20, 21], which indicates that 300-500 has no clear structural function and does not participate in the construction of the protein[29, 30]. A plausible reason is that this region is upstream in the S1 region and might act as nucleotide sequence that is translated into regulatory proteins.
After adding 40 Omicron sequences to the 734 variant samples and running the model again, the input frequency map shows no obvious change, but the generated sequences show a slight shift of the mutational sites on the spike protein sequence. In Fig. 4e, besides the frequent mutations between 300 and 500, the region from the start of the spike protein to position 300 tends to mutate. Positions 0 to 300 in the S1 region belong to the N-terminal domain, whose role is still not well characterized[44, 45]. However, its ability to recognize specific carbohydrate moieties on human cells during early attachment makes it a suspected factor in increasing ACE2 binding affinity in human cells[46].
As shown in Fig. 4b and 4f, no significant mutational sites appear between 1000 and 2000, which is consistent with the actual situation.
Based on Fig. 4c and 4g, the generated sequences for Delta and Omicron both have highly frequent mutations in the second half of the sequence, around positions 2400-2800. The corresponding protein sequence lies in the FP, HR1 and HR2 sections of the S2 subunit shown in Fig. 1a. The FP region has a high content of glycine and alanine and combines with HR1 and HR2 to allow better anchoring and viral entry into the host cell after the RBD binds to the cell.
In Fig. 4d and 4h, both Delta and Omicron generate new mutation sites between position 3300 and the end of the spike protein sequence, which corresponds to the transmembrane domain and the cytoplasmic tail domain. Both parts are believed to be highly conserved and to play an important role across related coronaviruses[47]. However, given the insufficient information and limited understanding of SARS-CoV-2, it is hard to conclude whether these mutations will decrease the infectivity of the virus or alter its mechanism to cause more severe damage.
As mentioned above, the positions with higher frequency in the prediction model (within 300-500 and 3300-3500) are an indication of the possible future evolution of COVID-19 variants as the pandemic continues. With the help of the model, we predict possible mutations from a novel perspective. This result may shed light on future mutation trends, and the approach can be applied to other virus strains. Given a more complete understanding of the function of each region of the spike protein and more comprehensive knowledge of protein folding and other changes of biological form, our model can incorporate such additional knowledge and hence become more complete and robust.
In this work, we build a quantum style-based GAN for predicting the spike protein variation structure of COVID-19 epidemic strains from mutated SARS-CoV-2 RNA sequences. The model predicts future COVID-19 variants with distinctive differences in mutational sites after Omicron samples are added, which supports its practicability and helps rule out the influence of randomness. The performance also shows the great potential of quantum machine learning. Using metrics from quantum information, we obtain high fidelity as shown in Fig. 3, usually above 96% for Delta and 94% for Omicron. This means that our generated variation structures of COVID-19 epidemic strains are significant from both the quantum-model and the biological point of view. Furthermore, we obtain a stable and convergent quantum GAN training process, which is uncommon in classical models.
This hybrid quantum-classical model also has some limitations that suggest future work. First, more mutated SARS-CoV-2 RNAs may improve model training, and they are accessible because multiple variants exist. Second, more sophisticated quantum encoding can be developed in the future. Third, the existing mapping method can be replaced by inserting the Delta or Omicron cohort in a 10-qubit module following the principle of style mixing. Finally, we might also modify the model generator to fit gene sequences directly by leveraging transformers.
Fortunately, the DeepQuantum framework accelerates the development of our model. This framework serves as a quantum version of PyTorch and tightly combines quantum theory with deep learning. With the help of DeepQuantum, researchers can explore more hybrid quantum-classical models in various application scenarios and bring the advantages of quantum algorithms to deep learning applications.
Conclusion
In summary, quantum style-based GAN can predict the variation structure of COVID-19 epidemic strains and is promising for application to other zoonotic viruses. The model generates mutational sites as frequency tables both with and without the addition of Omicron samples, and the two results are clearly differentiated by their distinctive characteristics. The model also includes quantum innovations such as quantum progressive training and quantum-inspired blur convolution, and its high fidelities and stable training process support the power of quantum networks.
More sophisticated conclusions may lie in the close combination of quantum algorithms and classical artificial intelligence. Our proposed model stabilizes the GAN training process, which provides an example of promoting artificial intelligence (AI) with quantum computation. Moreover, with the help of quantum computation, some problems that are difficult for classical deep learning methods can be solved. In addition, the elapsed time of quantum algorithms with the same methodology and effect as classical modules is greatly reduced. The participation of AI in quantum computation, in turn, naturally enriches the solutions to problems in quantum fields.
Finally, DeepQuantum works as a simple tool combining the torch programming of deep learning with the Qiskit-style programming of quantum computing. It could be an effective bridge between quantum computation and artificial intelligence, thus contributing to the development and construction of hybrid quantum-classical models with the advantages of quantum algorithms.
Methods
Strain selection and alignment. Before selecting appropriate variant strains, we chose a wild-type sequence, severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, first recorded in March 2020, as the reference sequence for analysis and alignment. NCBI has documented this sequence under the name NC_045512. The prevalence of COVID-19 cases has been boosted in the past few months by the emergence of two new variants of concern (VOCs), Delta and Omicron. These two variants are specifically chosen for predicting spike protein mutation frequency because of their accelerating spread and relatively high hazard worldwide. As Fig. 1a shows, both variants contain a considerable number of mutational sites that are known to strengthen the virus. We accessed the GSA and GISAID databases from May 30 to June 29, 2021 to derive 734 mutated SARS-CoV-2 sequences from 8 different variants[17, 18]. After the outbreak of the Omicron variant, we derived 40 good-quality Omicron samples and added them to the training set for the second prediction model. To ensure accuracy and precision, collected sequences must be complete and must have undergone multiple reviews.
CoVariant and CovidCG are used to provide integrated and pre-analyzed data in a well-classified format, which simplifies our own data processing. CoVariant, a website supported by the GISAID database, is used as a source of detailed descriptions and protein model presentations for each variant[18]. The CovidCG website presents single nucleotide variant frequencies on the nucleotide sequence, and the corresponding amino acids, as bar plots. CovidCG also shows the co-occurrence of different mutational sites, giving a thorough picture of how the sites interact.
Statistical computing software is used to analyze the data and to investigate and summarize the relationships of interest. The frequency distribution of the mutational sites within the selected sequences is shown as a bar plot using R, with an algorithm that selects high-frequency positions by setting thresholds. The selected genomes were aligned using clustalo with the multiple-alignment setting to obtain the relative probability of a mutation at a particular position for the construction of the prediction model.
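The thresholding idea can be sketched as follows. The original analysis was written in R; this Python version is only an illustration with made-up toy sequences, and the helper names `mutation_frequencies` and `frequent_sites` are hypothetical. Positions are 1-based, and any character that differs from the reference (including a gap) is counted as a mutation, which is an assumption.

```python
from collections import Counter


def mutation_frequencies(aligned, reference):
    """Fraction of aligned sequences that differ from the reference at each position."""
    counts = Counter()
    for seq in aligned:
        if len(seq) != len(reference):
            raise ValueError("all sequences must be aligned to the reference length")
        for pos, (a, b) in enumerate(zip(seq, reference), start=1):
            if a != b:
                counts[pos] += 1
    total = len(aligned)
    return {pos: count / total for pos, count in sorted(counts.items())}


def frequent_sites(freqs, threshold):
    """Positions whose mutation frequency reaches the threshold."""
    return [pos for pos, freq in freqs.items() if freq >= threshold]


# Toy alignment (not real data).
reference = "ACGTAC"
aligned = ["ACGTAC", "ACTTAC", "ACTTAA", "GCTTAC"]

freqs = mutation_frequencies(aligned, reference)
print(freqs)                        # {1: 0.25, 3: 0.75, 6: 0.25}
print(frequent_sites(freqs, 0.5))   # [3]
```

The resulting dictionary is what a bar plot of mutational frequency per site would display, and the threshold keeps only the positions that form the peaks.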
Interception and compression of spike protein gene sequences. We construct the spike protein cohort to train the proposed hybrid quantum-classical model. After alignment with the first identified SARS-CoV-2 sequence, NC_045512.2, we intercept loci 21563-25384 of the 774 mutated SARS-CoV-2 complete genomes as the spike protein gene sequences.
Each spike protein gene sequence therefore contains 3822 loci (25384 - 21563 + 1), which does not match the 10-qubit input size. We therefore compress redundant information to obtain final spike protein gene sequences of size 1024, matching the input of the quantum networks. We first fix all loci with mutations by comparing the spike protein gene sequences with NC_045512.2 one by one. Then, we keep the positions adjacent to the mutation loci. Finally, the set is filled up to 1024 loci by randomly selecting other positions, and we record all the chosen positions. This yields the spike protein cohort of 1024 loci.
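A possible reading of this selection procedure is sketched below in Python. It assumes that "adjacent" means the immediate neighbours (one position to either side) and that positions are 0-based offsets within the 3822-locus spike sequence; both are assumptions, as are the function name `select_loci` and the example mutation positions.

```python
import random


def select_loci(mutated, length, target, seed=0):
    """Choose `target` positions: mutated loci first, then neighbours, then random fill."""
    if target > length:
        raise ValueError("target cannot exceed the sequence length")
    chosen = {p for p in mutated if 0 <= p < length}
    if len(chosen) > target:
        raise ValueError("more mutated loci than target positions")
    # Keep the positions adjacent to each mutated locus while room remains.
    for p in sorted(chosen):
        for q in (p - 1, p + 1):
            if len(chosen) >= target:
                break
            if 0 <= q < length:
                chosen.add(q)
    # Fill the remainder with randomly selected other positions.
    rng = random.Random(seed)
    remaining = [p for p in range(length) if p not in chosen]
    chosen.update(rng.sample(remaining, target - len(chosen)))
    return sorted(chosen)   # recorded so that positions can be mapped back later


# Hypothetical mutated loci, for illustration only.
loci = select_loci([5, 100, 3000], length=3822, target=1024)
print(len(loci))
```

Recording the sorted list of chosen positions is what later allows a position in the 1024-locus representation to be mapped back to the 3822-locus spike sequence and to the complete genome.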
Quantum encoding for generated feature maps. For the spike protein gene sequences of the training cohort, we apply quantum encoding directly to the vectors. For the feature maps produced by the quantum style-based GAN generator, however, more steps are needed. First, we take the diagonal of each generated feature map. Taking the square root of the diagonal gives a generated vector, to which quantum encoding can be applied as previously described: we apply L2 normalization and then multiply the vector by its conjugate transpose.
Variation structure generating and mapping. The variation structure information learned by quantum style-based GAN is stored in the generated feature maps. To extract accurate and reliable mutation information that is also biologically meaningful, we proceed as follows.
By investigating the similarity between the 774 mutated SARS-CoV-2 spike protein gene sequences and NC_045512.2, we assume a similarity of 98% (97%) for the generated Delta (Omicron) spike protein gene sequences. In other words, in each generated spike protein gene sequence only 21 (31) of the 1024 loci mutate, and the other loci remain the same as in NC_045512.2. To obtain these mutation positions, we first get a probability vector by taking the square root of the diagonal of each generated feature map. Next, we sort the values of this probability vector and choose the top 21 (31) indices as mutation positions. This is insufficient, however, unless the mutation positions are mapped back to the original spike protein gene sequence with 3822 loci, or even to the complete genome. We therefore report the mutation positions of a generated variation structure in all three coordinate systems (1024 loci, 3822 loci and whole genome).
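The extraction and back-mapping of mutation positions can be sketched as below. The example uses a toy 8×8 feature map and k = 3 rather than 1024×1024 and k = 21 (31). It assumes NumPy, that the recorded positions are 0-based offsets within the spike sequence, and that genome coordinates are obtained by adding the spike start locus 21563; the function name `top_mutation_positions` and the sample values are illustrative.

```python
import numpy as np

SPIKE_START = 21563  # first spike locus on the reference genome (from the interception step)


def top_mutation_positions(feature_map, kept_positions, k):
    """Return the top-k mutation positions in compressed, spike and genome coordinates."""
    diagonal = np.real(np.diag(feature_map))
    probs = np.sqrt(np.clip(diagonal, 0.0, None))      # probability vector
    top = np.argsort(probs)[::-1][:k]                  # indices of the k largest values
    compressed = sorted(int(i) for i in top)
    spike = sorted(int(kept_positions[i]) for i in top)
    genome = [SPIKE_START + p for p in spike]
    return compressed, spike, genome


# Toy generated feature map: only the diagonal matters for this step.
feature_map = np.diag([0.01, 0.30, 0.02, 0.25, 0.05, 0.20, 0.07, 0.10])
kept_positions = [10, 20, 30, 40, 50, 60, 70, 80]     # hypothetical recorded spike offsets

compressed, spike, genome = top_mutation_positions(feature_map, kept_positions, k=3)
print(compressed)   # [1, 3, 5]
print(spike)        # [20, 40, 60]
print(genome)       # [21583, 21603, 21623]
```

Mapping a generated variation structure onto a Delta or Omicron sequence then amounts to marking these positions as mutated on a chosen aligned sequence.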
Mapping the generated variation structure to COVID-19 epidemic strains follows a similar principle. We randomly choose some aligned spike protein gene sequences of the COVID-19 epidemic strain Delta or Omicron and then map the generated variation structure onto them at the mutation positions. In this way, we obtain new mutated gene sequences that carry the variation characteristics of SARS-CoV-2 variants.
Torch-based quantum deep learning. We implement our hybrid quantum-classical model QuStyleGAN with our self-built, torch-based software framework, DeepQuantum. DeepQuantum aims to combine quantum computing with machine learning and to provide a powerful tool for efficiently constructing hybrid quantum-classical models in interdisciplinary research. It is designed with the following characteristics:

QML application platform. QML stands for quantum machine learning. DeepQuantum tightly couples quantum information with machine learning because its basic quantum structures are defined by torch variables such as torch.tensor and by torch operators. Owing to this design, DeepQuantum can serve as an application platform for developing various hybrid quantum-classical models and exploring quantum advantages.

Simultaneous parameter optimization. By leveraging the torch computation graph, DeepQuantum can construct hybrid quantum-classical models with seamless backpropagation across classical and quantum neural networks. The parameters of such a hybrid model can therefore be trained and updated simultaneously by a single classical torch optimizer.

Facilitating interdisciplinary research. Quantum machine learning can be combined with various disciplines. DeepQuantum provides tools to promote interdisciplinary applications such as the prediction of binding energy by quantum mutual learning and the prediction of gene interaction by quantum variational circuits.
Fidelity. We evaluate our hybrid quantum-classical model by the fidelity from quantum information theory. After quantum encoding, both the gene sequences from the spike protein cohort and the generated spike protein gene sequences are represented as quantum states. We use Eq. (1) to calculate the fidelity between them:
(1) 
Also, we can use another form
(2) 
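For reference, the fidelity commonly used in quantum information can be written as follows. The symbols $\rho$ and $\sigma$ (for the encoded cohort sequence and the encoded generated sequence) and $|\psi\rangle$, $|\phi\rangle$ are notation assumed here, and some texts define the fidelity as the square root of this quantity.

```latex
% Fidelity between two density matrices \rho and \sigma
F(\rho, \sigma) = \left( \operatorname{tr} \sqrt{ \sqrt{\rho}\, \sigma \sqrt{\rho} } \right)^{2}

% Special case of two pure states,
% \rho = |\psi\rangle\langle\psi| and \sigma = |\phi\rangle\langle\phi|
F(\rho, \sigma) = \operatorname{tr}(\rho \sigma) = \left| \langle \psi | \phi \rangle \right|^{2}
```

Because the encoding described earlier (L2 normalization followed by multiplication with the conjugate transpose) produces pure states, the second form would apply directly under that assumption.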
Two kinds of loss functions. We use the logistic loss and the relativistic hinge loss to compare the training process of the quantum and classical style-based GANs. Both loss functions are specified below.
For the quantum model, we use the non-saturating logistic loss without gradient penalty:
(3) 
(4) 
Here G denotes the generator, D the discriminator, z the latent code and x the true data. For the classical model, we add a penalty term to the discriminator loss solely to make training succeed. The penalty term is:
(5) 
In experiment, we take and .
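A minimal sketch of the non-saturating logistic loss, assuming the standard formulation in which the discriminator outputs raw scores (logits) and the loss is expressed through the softplus function. The function names are illustrative, and the penalty term is omitted because its exact form and coefficients cannot be recovered from the text.

```python
import math


def softplus(t):
    """Numerically stable log(1 + exp(t))."""
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def mean(values):
    return sum(values) / len(values)


def generator_loss(fake_scores):
    """Non-saturating generator loss: mean of softplus(-D(G(z)))."""
    return mean([softplus(-s) for s in fake_scores])


def discriminator_loss(real_scores, fake_scores):
    """Mean of softplus(D(G(z))) plus mean of softplus(-D(x))."""
    return (mean([softplus(s) for s in fake_scores])
            + mean([softplus(-s) for s in real_scores]))


# An undecided discriminator (all scores 0) gives log(2) per term.
g = generator_loss([0.0, 0.0])
d = discriminator_loss([0.0, 0.0], [0.0, 0.0])
```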
For the relativistic hinge loss, the quantum and classical style-based GANs adopt the same form. We first compute two differences:
(6) 
(7) 
Then we construct the generator loss and the discriminator loss:
(8) 
(9) 
(10) 
represents for the mean value.
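The construction above can be sketched as follows, assuming it matches the standard relativistic average hinge loss: each score is compared with the mean score of the opposite class, and hinge terms are then averaged. Function and variable names are illustrative and not taken from the authors' code.

```python
def mean(values):
    return sum(values) / len(values)


def relu(t):
    return max(t, 0.0)


def relativistic_differences(real_scores, fake_scores):
    """Real scores minus the mean fake score, and fake scores minus the mean real score."""
    real_mean, fake_mean = mean(real_scores), mean(fake_scores)
    real_minus_fake = [r - fake_mean for r in real_scores]
    fake_minus_real = [f - real_mean for f in fake_scores]
    return real_minus_fake, fake_minus_real


def discriminator_loss(real_scores, fake_scores):
    r_f, f_r = relativistic_differences(real_scores, fake_scores)
    return mean([relu(1.0 - d) for d in r_f]) + mean([relu(1.0 + d) for d in f_r])


def generator_loss(real_scores, fake_scores):
    r_f, f_r = relativistic_differences(real_scores, fake_scores)
    return mean([relu(1.0 + d) for d in r_f]) + mean([relu(1.0 - d) for d in f_r])


# A discriminator that separates the classes by a wide margin has zero loss,
# while the generator loss is large.
d = discriminator_loss([2.0, 2.0], [-2.0, -2.0])
g = generator_loss([2.0, 2.0], [-2.0, -2.0])
```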
References
 [1] W. N. et al., “Open resource of clinical data from patients with pneumonia for the prediction of COVID-19 outcomes via deep learning,” Nature Biomedical Engineering, vol. 4, no. 12, pp. 1197–1207, 2020.
 [2] G. W. et al., “A deep-learning pipeline for the diagnosis and discrimination of viral, non-viral and COVID-19 pneumonia from chest X-ray images,” Nature Biomedical Engineering, vol. 5, no. 6, pp. 509–521, 2021.
 [3] T.-H. P. et al., “A deep learning framework for high-throughput mechanism-driven phenotype compound screening and its application to COVID-19 drug repurposing,” Nature Machine Intelligence, vol. 3, no. 3, pp. 247–257, 2021.
 [4] C. J. et al., “Development and evaluation of an artificial intelligence system for COVID-19 diagnosis,” Nature Communications, vol. 11, no. 1, p. 5088, 2020.
 [5] Y. G. et al., “Machine learning based early warning system enables accurate mortality risk prediction for COVID-19,” Nature Communications, vol. 11, no. 1, p. 5033, 2020.
 [6] M. F. et al., “Reprogrammed CRISPR-Cas13b suppresses SARS-CoV-2 replication and circumvents its mutational escape through mismatch tolerance,” Nature Communications, vol. 12, no. 1, p. 4270, 2021.
 [7] H.-Y. H. et al., “Power of data in quantum machine learning,” Nature Communications, vol. 12, no. 1, p. 2631, 2021.
 [8] I. Cong, S. Choi, and M. D. Lukin, “Quantum convolutional neural networks,” Nature Physics, vol. 15, no. 12, pp. 1273–1278, 2019.
 [9] P. W. et al., “QGAN: Quantized generative adversarial networks,” arXiv:1901.08263, 2019.
 [10] T. Karras, S. Laine, and T. Aila, “A style-based generator architecture for generative adversarial networks,” arXiv:1812.04948, 2018.
 [11] T. Karras, T. Aila, S. Laine, and J. Lehtinen, “Progressive growing of GANs for improved quality, stability, and variation,” arXiv:1710.10196, 2017.
 [12] T. K. et al., “Analyzing and improving the image quality of StyleGAN,” arXiv:1912.04958, 2019.
 [13] S. J. et al., “Parametrized quantum policies for reinforcement learning,” arXiv:2103.05577v2, 2021.
 [14] M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein GAN,” arXiv:1701.07875, 2017.
 [15] I. G. et al., “Improved training of Wasserstein GANs,” arXiv:1704.00028, 2017.
 [16] M. B. et al., “TensorFlow Quantum: A software framework for quantum machine learning,” arXiv:2003.02989, 2021.
 [17] F. W. et al., “A new coronavirus associated with human respiratory disease in China,” Nature, vol. 579, no. 7798, pp. 265–269, 2020.
 [18] S. Elbe and G. Buckland-Merrett, “Data, disease and diplomacy: GISAID’s innovative contribution to global health,” Global Challenges, vol. 1, no. 1, pp. 33–46, 2017.
 [19] Q. L. et al., “The impact of mutations in SARS-CoV-2 spike on viral infectivity and antigenicity,” Cell, vol. 182, no. 5, pp. 1284–1294, 2020.
 [20] B. K. et al., “Tracking changes in SARS-CoV-2 spike: evidence that D614G increases infectivity of the COVID-19 virus,” Cell, vol. 182, no. 4, pp. 812–827, 2020.
 [21] J. A. P. et al., “Spike mutation D614G alters SARS-CoV-2 fitness,” Nature, vol. 592, no. 7852, pp. 116–121, 2021.
 [22] B. Z. et al., “SARS-CoV-2 spike D614G change enhances replication and transmission,” Nature, vol. 592, no. 7852, pp. 122–127, 2021.
 [23] A. C. W. et al., “Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein,” Cell, vol. 181, no. 2, pp. 281–292, 2020.
 [24] M. H. et al., “A multibasic cleavage site in the spike protein of SARS-CoV-2 is essential for infection of human lung cells,” Molecular Cell, vol. 78, no. 4, pp. 779–784, 2020.
 [25] L. C.-C. et al., “Neuropilin-1 facilitates SARS-CoV-2 cell entry and infectivity,” Science, vol. 370, no. 6518, pp. 856–860, 2020.
 [26] M. H. et al., “SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor,” Cell, vol. 181, no. 2, pp. 271–280, 2020.
 [27] B. T. et al., “In situ structural analysis of SARS-CoV-2 spike reveals flexibility mediated by three hinges,” Science, vol. 370, no. 6513, pp. 203–208, 2020.
 [28] X. O. et al., “Characterization of spike glycoprotein of SARS-CoV-2 on virus entry and its immune cross-reactivity with SARS-CoV,” Nature Communications, vol. 11, no. 1, p. 1620, 2020.
 [29] A. K. A. et al., “Immune response to SARS-CoV-2 and mechanisms of immunopathological changes in COVID-19,” Allergy, vol. 75, no. 7, pp. 1564–1581, 2020.
 [30] A. R. B. et al., “Angiotensin-converting enzyme 2 (ACE2), SARS-CoV-2 and the pathophysiology of coronavirus disease 2019 (COVID-19),” The Journal of Pathology, vol. 251, no. 3, pp. 228–248, 2020.
 [31] S. C. et al., “Quantum Wasserstein generative adversarial networks,” arXiv:1911.00111, 2019.
 [32] J. Romero and A. Aspuru-Guzik, “Variational quantum generators: Generative adversarial quantum machine learning for continuous distributions,” Advanced Quantum Technologies, vol. 4, no. 1, p. 2000003, 2021.
 [33] J. Z. et al., “Learning and inference on generative adversarial quantum circuits,” Physical Review A, vol. 99, no. 5, p. 052306, 2019.
 [34] C. Zoufal, A. Lucchi, and S. Woerner, “Quantum generative adversarial networks for learning and loading random distributions,” npj Quantum Information, vol. 5, no. 1, p. 103, 2019.
 [35] D. Z. et al., “Training of quantum circuits on a hybrid quantum computer,” Science Advances, vol. 5, no. 10, p. eaaw9918, 2019.
 [36] Y. Du, M.-H. Hsieh, and D. Tao, “Efficient online quantum generative adversarial learning algorithms with applications,” arXiv:1904.09602, 2019.
 [37] I. J. G. et al., “Generative adversarial networks,” arXiv:1406.2661, 2014.
 [38] H. S. et al., “Quantum generative adversarial network for generating discrete distribution,” Information Sciences, vol. 538, pp. 193–208, 2020.
 [39] A. A. et al., “Noise robustness and experimental demonstration of a quantum generative adversarial network for continuous distributions,” Advanced Quantum Technologies, vol. 4, no. 5, p. 2000069, 2021.
 [40] H.-L. H. et al., “Experimental quantum generative adversarial networks for image generation,” Physical Review Applied, vol. 16, no. 2, p. 024051, 2021.
 [41] Biorender, “Biorender Template for COVID-19,” 2021. [Online]. Available: https://app.biorender.com/biorendertemplates
 [42] S. Y. G. et al., “Contribution of single mutations to selected SARS-CoV-2 emerging variants spike antigenicity,” Virology, vol. 563, pp. 134–145, 2021.
 [43] J. Z. et al., “SARS-CoV-2 variant prediction and antiviral drug design are enabled by RBD in vitro evolution,” Nature Microbiology, vol. 6, no. 9, pp. 1188–1198, 2021.
 [44] S. K. et al., “Omicron and Delta variant of SARS-CoV-2: A comparative computational study of spike protein,” Journal of Medical Virology, 2021.
 [45] J. Q. et al., “Omicron variant of the SARS-CoV-2: a quest to define the consequences of its high mutational load,” Geroscience, 2021.
 [46] C. K. et al., “Point mutations in the S protein connect the sialic acid binding activity with the enteropathogenicity of transmissible gastroenteritis coronavirus,” Journal of Virology, vol. 71, no. 4, pp. 3285–3287, 1997.
 [47] Y. H. et al., “Structural and functional properties of SARS-CoV-2 spike protein: potential antivirus drug development for COVID-19,” Acta Pharmacologica Sinica, vol. 41, no. 9, pp. 1141–1149, 2020.
Supplementary Materials: Quantum Deep Learning for Mutant COVID-19 Strain Prediction
StyleGAN. The style-based GAN (StyleGAN) is a generative adversarial network proposed by NVIDIA as a successor to the progressive GAN (ProGAN). ProGAN was designed to address quality issues in generating high-resolution images, and its core idea is progressive training: it starts with a low-resolution 4x4 image generator and discriminator and gradually raises the resolution, so that the model learns to construct finer details and to generate images of much higher resolution. However, ProGAN offers no control over specific features of the generated images: a slight modification of the random input leads to a series of changes across the final generated image.
StyleGAN addresses this problem by feeding the style input to different layers, so that each input manipulates only a particular class of features. Taking a human face as an example, coarse-scale inputs control features such as pose, face shape, hairstyle and glasses; medium-scale inputs control finer features such as hair details and whether the lips are closed; fine-scale inputs control the color of the hair, the color of the skin and even the deblurring of the background, which makes the image more realistic.
The innovation of StyleGAN lies mainly in the generator. Instead of feeding the latent code directly into the generator as in a traditional GAN, StyleGAN passes it through a mapping network and injects the result into each layer via adaptive instance normalization (AdaIN). In addition, replacing the traditional input with a learned constant is believed to reduce the entanglement between features and to improve the quality of the image.
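For reference, the mapping network and the AdaIN operation can be written as below, following the formulation in the StyleGAN paper [10]. The symbols are those of that paper rather than of this supplement: $z$ is the latent code, $f$ the mapping network, $w$ the intermediate latent code, $x_i$ the $i$-th feature map, and $y = (y_s, y_b)$ the style obtained from $w$ by a learned affine transformation.

```latex
% Mapping network: latent code to intermediate latent code
w = f(z)

% Adaptive instance normalization of feature map x_i with style y = (y_s, y_b)
\mathrm{AdaIN}(x_i, y) = y_{s,i} \, \frac{x_i - \mu(x_i)}{\sigma(x_i)} + y_{b,i}
```

Each feature map is first normalized to zero mean and unit variance and then rescaled and shifted by the style, which is how a style input at a given layer affects only the features of that scale.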
QCNN based on quantum circuits. Quantum convolutional networks mainly comprise quantum convolution layers, quantum pooling layers and quantum dense layers, which are built from rotation-X, rotation-Y, rotation-Z and CNOT gates. As shown in Fig. S1, our quantum progressive discriminator uses three kinds of QuConv, three kinds of QuPool and one QuDense.
Each of the first four blocks consists of the first quantum convolution layer, the first quantum pooling layer, a quantum-inspired blur convolution layer, the second quantum convolution layer, the second quantum pooling layer and a partial trace operation. The last block consists of the last quantum convolution layer, the last quantum pooling layer and the quantum dense layer.
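For reference, the four gates named above have the following standard matrix forms. The sign and phase conventions shown are the common textbook ones and are an assumption here; the convention used in DeepQuantum may differ by a global phase.

```latex
R_x(\theta) = \begin{pmatrix} \cos\frac{\theta}{2} & -i \sin\frac{\theta}{2} \\ -i \sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}, \quad
R_y(\theta) = \begin{pmatrix} \cos\frac{\theta}{2} & -\sin\frac{\theta}{2} \\ \sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}, \quad
R_z(\theta) = \begin{pmatrix} e^{-i\theta/2} & 0 \\ 0 & e^{i\theta/2} \end{pmatrix}

\mathrm{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}
```

The rotation angles are the trainable parameters of a layer, while the CNOT gates entangle neighbouring qubits.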
Group convolution and depthwise convolution. A grouped convolution splits a convolution into a set of smaller convolutions. It was first used as an engineering compromise to distribute a network over multiple GPUs and to help the network learn features at multiple levels, but it was later found to improve classification accuracy in ResNet and other models. Moreover, increasing the number of groups exposes a new dimension, the cardinality (the size of the set of transformations), and increasing the cardinality was shown to increase accuracy.
In depthwise convolution, each filter channel is applied to only one input channel. Consider an example with a 3-channel filter. Depthwise convolution splits the filter and the image into three separate channels, convolves each image channel with the corresponding filter channel, and then stacks the results back together. To produce the same effect with a normal convolution, one would select a channel, set all elements of the filter to zero except in that channel, and then convolve, which requires three different filters, one per channel. Although the number of parameters remains the same, depthwise convolution yields three output channels with only one filter, because the three filters are merged into one.
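The comparison above can be illustrated with a small plain-Python sketch. It is written for clarity rather than speed, uses stride 1 without padding, and all function names are illustrative.

```python
def conv2d_single(channel, kernel):
    """2D cross-correlation of one channel with one kernel (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(channel) - kh + 1
    out_w = len(channel[0]) - kw + 1
    return [
        [
            sum(channel[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def depthwise_conv2d(image, filt):
    """Apply filter channel k to image channel k only, then stack the outputs."""
    if len(image) != len(filt):
        raise ValueError("image and filter must have the same number of channels")
    return [conv2d_single(ch, k) for ch, k in zip(image, filt)]


def standard_conv2d(image, filt):
    """Normal convolution with one multi-channel filter: sum over all channels."""
    per_channel = depthwise_conv2d(image, filt)
    h, w = len(per_channel[0]), len(per_channel[0][0])
    return [[sum(ch[r][c] for ch in per_channel) for c in range(w)] for r in range(h)]


def masked_filter(filt, keep):
    """Copy of the filter with every channel except `keep` set to zero."""
    return [k if idx == keep else [[0 for _ in row] for row in k]
            for idx, k in enumerate(filt)]


image = [[[1, 2], [3, 4]], [[1, 1], [1, 1]], [[0, 1], [0, 1]]]  # 3 channels, 2x2
filt = [[[1, 0], [0, 1]], [[2, 2], [2, 2]], [[1, 1], [1, 1]]]   # 3 channels, 2x2
depthwise_out = depthwise_conv2d(image, filt)
```

Running a normal convolution with `masked_filter(filt, k)` reproduces output channel `k` of the depthwise result, which is the equivalence described in the text.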
Classical progressive training. Progressive training, introduced with the progressive GAN, is the procedure by which the StyleGAN generator and discriminator are trained. It can be interpreted as gradually increasing the complexity of the data that the model has to fit. For example, a model trained on images can start with small images such as 4x4. Once the model has fitted and stabilized, both the discriminator and the generator are expanded to twice the previous resolution, e.g. 8x8, by adding a new block that supports the larger image size. After each stabilization phase, the model is expanded again in the same way, until the generator and discriminator handle more complicated images, e.g. 1024x1024.
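The resolution schedule described above can be sketched as follows. Only the doubling rule is taken from the text; the function name is illustrative, and details such as the number of training iterations per stage or the fade-in of new blocks are not specified here and are left out.

```python
def resolution_schedule(start=4, final=1024):
    """Resolutions visited when every growth step doubles the image side length."""
    if start <= 0 or final < start:
        raise ValueError("require 0 < start <= final")
    resolutions = [start]
    while resolutions[-1] < final:
        resolutions.append(resolutions[-1] * 2)
    if resolutions[-1] != final:
        raise ValueError("final must equal start times a power of two")
    return resolutions


# One generator block and one discriminator block are added per growth step.
schedule = resolution_schedule(4, 1024)
num_growth_steps = len(schedule) - 1
```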
Original training process of the quantum and classical style-based GANs. To present the contrast between the training processes of the quantum and classical models clearly, we applied a logarithmic transformation to the original loss in the main article. The original training process is displayed in Fig. S2.