Quantum Deep Learning for Mutant COVID-19 Strain Prediction

by Yu-Xin Jin, et al.
Shanghai Jiao Tong University

New COVID-19 epidemic strains such as Delta and Omicron, with increased transmissibility and pathogenicity, emerged and spread rapidly across the world while causing high mortality during the pandemic. Early prediction of possible variants (especially of the spike protein) of COVID-19 epidemic strains from available mutated SARS-CoV-2 RNA sequences could enable earlier prevention and treatment. Here, combining the advantages of quantum and quantum-inspired algorithms with the broad applicability of deep learning, we propose a development tool named DeepQuantum and use it to predict the spike protein variation structure of COVID-19 epidemic strains. This hybrid quantum-classical model is the first to realize a quantum-inspired blur convolution analogous to classical depthwise convolution and to successfully apply quantum progressive training with quantum circuits, which together make our model the quantum counterpart of the well-known style-based GAN. The results show that the fidelities of randomly generated spike protein variation structures are always above 96%, and that training converges better under multiple loss functions than the corresponding classical algorithm. These results provide strong evidence that quantum-inspired algorithms can enhance classical deep learning and that hybrid models can effectively predict mutant strains.




Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causal pathogen of coronavirus disease 2019 (COVID-19), had infected over 365 million people and caused over 5.65 million deaths worldwide as of January 28, 2022[5]. Even though numerous vaccines have proven efficient at suppressing the severity and infectivity of the virus, multiple new SARS-CoV-2 variants emerge rapidly and put great pressure on therapeutics. As reported, the B.1.617.2 (Indian, Delta) strain is now the globally dominant variant of concern, showing stronger transmissibility and a higher risk of hospitalization, intensive care and death[6]. Moreover, the B.1.617.2 strain weakens antibody neutralization of the virus and increases the possibility of reinfection for convalescent patients. In late November 2021, the emergence of B.1.1.529 (South African, Omicron) dramatically boosted the spread of the virus and was shown to significantly decrease vaccine efficacy. Other variants of concern, including B.1.1.7 (British, Alpha), B.1.351 (South African, Beta) and P.1 (Brazilian, Gamma), have also created chaos for global pandemic control. Therefore, there is an urgent need to predict the variation structure of COVID-19 epidemic strains such as Delta and Omicron in order to take early preventive action and enable quick treatment, thus contributing to global pandemic control[1, 2, 3, 4].

Quantum computation[7] has attracted considerable interest in both academia and industry and has been widely applied to machine learning. Since near-term quantum devices are still fairly noisy, researchers mainly focus on hybrid quantum-classical machine learning algorithms. As accelerators in heterogeneous computing architectures, photonic and quantum processors directly improve the performance of artificial neural networks and play the role of domain-specific processors. To bring the advantages of quantum processors to a wide range of deep learning applications, a high-level library and supporting compiler software are necessary. Our self-developed DeepQuantum aims to make it easy to develop quantum applications and to draw on the artificial intelligence community. On that foundation, hybrid quantum-classical models including variational quantum eigensolvers, quantum convolutional neural networks, quantum generative adversarial networks (GANs)[9, 32] and quantum reinforcement learning[13] have been developed. Additionally, we propose quantum style-based generative adversarial networks to better predict COVID-19 epidemic strains.

Style-based generative adversarial networks[37, 10, 12] have been successfully applied to generate high-resolution realistic images[40] in the style of the training dataset, enabling intuitive, scale-specific control of the synthesis. We introduce a hybrid quantum-classical model that combines this scale-specific control with the inherent acceleration of quantum computation[36] to optimize the training process of the classical GAN. With this feature-control capability, style mixing is performed by switching from one latent code to another at a selected point in the synthesis network, thus generating a feature map equipped with mixed features.

To preserve the quantum discriminator structure of our quantum style-based GAN to the greatest extent, we also establish an operation called quantum-inspired blur convolution, which is analogous to depthwise convolution in classical machine learning and enables quantum progressive training[11]. We then turn the discriminator of the style-based GAN into its quantum counterpart and adapt the original generator to SARS-CoV-2 RNAs. Comparing the classical and quantum style-based GANs by recording the training process under multiple loss functions[14, 15, 31], we find that training of the quantum model is more stable, converges better and is more flexible with respect to the loss function. Meanwhile, we use fidelity to evaluate the performance of predicting epidemic strains; over the generated samples, the fidelities are always above 96% for Delta and 94% for Omicron. Quantum style-based GAN thus predicts the mutation probability of coronavirus strains by mining past SARS-CoV-2 mutated RNA sequences, provides an effective way to realize quantum-classical hybrid algorithms in applications, and shows the advantages of combining quantum algorithms with deep learning.

Results and Discussion

Datasets of mutated SARS-CoV-2 RNAs. To collect diverse and essential SARS-CoV-2 mutation characteristics for quantum style-based GAN training, we include 774 mutated SARS-CoV-2 complete genomes from 9 variants of concern: B.1.1.7, B.1.351, B.1.429, B.1.525, B.1.526, B.1.617.1, P2, B.1.617.2 and B.1.1.529. Data come from the publicly available databases GISAID[18] and NCBI[17]. We construct the spike protein cohort by aligning complete genomes, intercepting the spike protein fragment and squeezing duplicate sequences (described in "Methods"). In the hybrid quantum-classical model, we use 734 samples from the first 8 variants as the training cohort for generating and predicting Delta mutational sites, and add 40 Omicron samples to this cohort as the training cohort for Omicron mutational sites. The statistical mutation frequency plot is split into four parts based on spike protein position.

Quantum encoding. To extract the variation features of mutated SARS-CoV-2 RNAs and fit them to the input of the hybrid quantum-classical model, we apply quantum encoding[39] to the spike protein cohort. For each spike-region gene sequence in the cohort, we compare it position by position with the first identified SARS-CoV-2 genome, NC_045512.2. If the two RNA bases differ at a given locus, we mark 1 at that position in a new vector, and 0 otherwise. We thus obtain a series of 0/1 vectors of size 1024 (since we start training from 10 qubits).

A density matrix is a Hermitian operator that must be normalized, with trace equal to 1, and positive semi-definite. To meet these conditions, we first apply L2 normalization to the stated vector, then compute its conjugate transpose, and finally multiply the two to build a 10-qubit density matrix. Similar quantum encoding is applied to each spike-region gene sequence we generate (described in "Methods").
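The encoding above can be sketched in a few lines of NumPy. This is a minimal illustration (the function names are ours), and it assumes at least one mismatch so the vector is nonzero and can be normalized:

```python
import numpy as np

def encode_sequence(seq, reference):
    """Mark 1 where the aligned sequence differs from the reference,
    0 where it matches (the cohort uses length-1024 sequences)."""
    return np.array([0.0 if a == b else 1.0 for a, b in zip(seq, reference)])

def to_density_matrix(v):
    """L2-normalize, then multiply by the conjugate transpose: the result
    is Hermitian, has unit trace and is positive semi-definite.
    Assumes v is nonzero, i.e. at least one mismatch."""
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

# Toy check on a length-8 fragment (the real cohort uses 1024 loci / 10 qubits):
v = encode_sequence("ACGTACGT", "ACCTACGA")
rho = to_density_matrix(v)
```

The resulting rank-1 matrix satisfies all three density-matrix conditions by construction.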

The algorithm flow of quantum style-based GAN. We propose a hybrid quantum-classical model, QuStyleGAN (quantum style-based GAN)[34, 38], for COVID-19 epidemic strain prediction. It directly takes the spike protein cohort as the training dataset, generates variation structures possessing SARS-CoV-2 mutation characteristics, and then maps each one to a spike-region gene sequence of the Delta or Omicron strain (details in "Methods"). The model comprises two key parts: the classical style-based generator and the quantum progressive discriminator.

The classical style-based generator (Supplementary Fig.S1d) contains a mapping network consisting of 8 fully connected layers and a synthesis network consisting of 5 blocks at progressively increasing resolutions, with two convolution layers per block. A detailed description is given in Supplementary style-based GAN. The quantum progressive discriminator is introduced below.

The workflow of quantum style-based GAN is shown in Fig.1. We first apply quantum encoding to the spike protein cohort to obtain true RNA mutation sequences. At the same time, the classical style-based generator produces fake RNA mutation sequences. Both kinds of data are then fed into the quantum progressive discriminator to obtain true-or-false judgment scores. According to the computed loss function, we update the parameters of both the classical generator and the quantum discriminator. This completes one step of model training.
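One training step can be sketched schematically. Everything below is a toy stand-in, not the paper's actual networks: the "discriminator" is a linear score, the data are 16-dimensional placeholders for the 1024-dimensional encoded sequences, and the loss forms are the standard logistic GAN losses mentioned later:

```python
import numpy as np

rng = np.random.default_rng(0)

def discriminator(x, w):
    """Toy stand-in for the quantum progressive discriminator:
    a real-valued score per encoded sequence."""
    return x @ w

def logistic_d_loss(real_scores, fake_scores):
    # Discriminator pushes real scores up and fake scores down.
    return np.mean(np.log1p(np.exp(-real_scores)) + np.log1p(np.exp(fake_scores)))

def logistic_g_loss(fake_scores):
    # Non-saturating generator loss: generator pushes fake scores up.
    return np.mean(np.log1p(np.exp(-fake_scores)))

# One schematic step: 16-dim toy stand-ins for the 1024-dim encoded sequences.
real = rng.normal(size=(8, 16))    # quantum-encoded true sequences
fake = rng.normal(size=(8, 16))    # generator output (placeholder)
w = rng.normal(size=16)

d_loss = logistic_d_loss(discriminator(real, w), discriminator(fake, w))
g_loss = logistic_g_loss(discriminator(fake, w))
# In the actual model, gradients of d_loss and g_loss update the quantum
# discriminator parameters and the classical generator, respectively.
```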

Quantum progressive discriminator and training. The quantum progressive discriminator (Fig.2a, Supplementary Fig.S1 and QCNN) contains 5 quantum circuit modules ranging from 10 qubits down to 2 qubits in steps of 2 qubits[33, 35]. For each of the first four modules, we construct two quantum convolution layers, two quantum pooling layers and one quantum-inspired blur convolution layer (described in the next section) with the corresponding number of qubits. Each quantum layer, built from quantum gates, corresponds to a unitary matrix; we apply these unitary matrices in the order displayed in Fig.2a. The operation following these unitary operators is a partial trace, represented by the 2-qubit observation symbols. The partial trace discards the information of 2 qubits in the density matrix while preserving the rest, reducing the density matrix in steps of 2 qubits. The last module of the quantum progressive discriminator contains a quantum convolution layer, a quantum pooling layer, a quantum dense layer and a 1-qubit observation, whose output is the final discriminative result.
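The partial-trace reduction between modules can be sketched with NumPy. This is a minimal sketch that assumes the discarded qubits are the last k (the text does not specify which qubits are traced out):

```python
import numpy as np

def partial_trace_last(rho, n_qubits, k=2):
    """Trace out the last k qubits of an n-qubit density matrix,
    yielding an (n-k)-qubit density matrix."""
    keep, drop = 2 ** (n_qubits - k), 2 ** k
    rho = rho.reshape(keep, drop, keep, drop)
    return np.trace(rho, axis1=1, axis2=3)

# 4-qubit pure state |0011> -> 2-qubit reduced state |00><00|
v = np.zeros(16)
v[3] = 1.0
rho4 = np.outer(v, v)
rho2 = partial_trace_last(rho4, n_qubits=4)
```

The reduced matrix keeps unit trace and Hermiticity, so it remains a valid density matrix for the next, smaller module.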

Quantum style-based GAN training adopts the same progressive growing methodology as the classical style-based GAN algorithm (Supplementary Classical progressive training). Progressively introducing pre-configured quantum layers, we start training with the 2-qubit block and then add larger blocks to the network, as visualized in Fig.2b. To fade new quantum layers in smoothly, we follow the residual-block operation and downscale density matrices to match the current block. Unlike classical progressive training, which downscales by average pooling, quantum progressive training downscales between density matrices by partial trace. The benefits of quantum progressive training include simplifying the original problem of mapping latent codes to spike protein variation structures, stabilizing the training process and reducing training time, as suggested by ProGAN[11].
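The residual fade-in can be written as a one-line blend. This is a sketch under our reading of the ProGAN-style schedule; `new_block` and `downscale` are placeholders for the newly added quantum layers and the partial-trace downscaling:

```python
import numpy as np

def faded_output(rho, new_block, downscale, alpha):
    """ProGAN-style fade-in: blend the newly added block's output with the
    partial-trace-downscaled input; alpha ramps from 0 to 1 during training.
    `new_block` and `downscale` are placeholders for the quantum layers and
    the partial trace."""
    return alpha * new_block(rho) + (1.0 - alpha) * downscale(rho)

# Toy check: at alpha = 0.5 the output is the average of both paths.
rho = np.eye(4) / 4.0
out = faded_output(rho, new_block=lambda r: 2.0 * r, downscale=lambda r: r, alpha=0.5)
```

At alpha = 0 the new block is invisible; at alpha = 1 it fully replaces the skip path.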

By contrasting the quantum and classical style-based GANs, with the progressive training process recorded under two loss functions (details in "Methods" and Fig.S2) every 74 steps, we can observe some advantages of quantum neural networks. Overall, Fig.2c and Fig.2d show that the training loss of quantum style-based GAN (QuStyleGAN) is more stable and converges better than that of classical style-based GAN (StyleGAN). For the logistic loss, the classical generator and discriminator losses oscillate alternately and never reach convergence over the whole 3500 steps. In contrast, the quantum generator and discriminator losses oscillate intensely at first but converge after 1000 steps and then sustain only slight oscillation. A similar situation occurs for the relativistic-hinge loss, where QuStyleGAN converges even more clearly after 1500 steps. These findings confirm the power of quantum neural networks, and the hybrid quantum-classical model could potentially yield a quantum advantage.

Quantum-inspired blur convolution.

While designing the corresponding quantum discriminator, we noted that its classical counterpart contains a blur layer that benefits noise reduction and feature extraction. On closer inspection, the blur layer is actually a group convolution in which the number of groups equals both the number of input channels and the number of output channels, also called depthwise convolution (Supplementary Group convolution and depthwise convolution). Depthwise convolution greatly reduces the number of parameters, makes effective use of hardware resources, and finally concatenates all feature maps. Based on these features and quantum properties, we propose a new quantum layer called quantum-inspired blur convolution.

As shown in Fig.3b, a quantum-inspired blur convolution requires three key steps: partition, quantum blur convolution and concatenation. For simplicity, we take a 4-qubit feature map as an example (other qubit numbers can be handled in the same way). In the partition step, we first divide the feature map into several smaller feature maps and then quantum encode them into 2-qubit density matrices. In the quantum blur convolution step, we construct a 2-qubit parameterized quantum circuit as the quantum blur convolution kernel and use it to evolve the density matrices. In the concatenation step, we assemble the evolved density matrices into a feature map and quantum encode it into a 4-qubit density matrix. This completes one quantum-inspired blur convolution.
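The three steps can be sketched in NumPy. This is a sketch under our assumptions: the kernel is modeled as a generic 2-qubit unitary rather than the 5-parameter circuit, and the rule for recombining the evolved pieces (via their diagonals) is our guess at the unspecified "concatenate" step:

```python
import numpy as np

def random_unitary(dim, seed=0):
    """QR-based random unitary, standing in for the 2-qubit parameterized
    blur-convolution kernel (the real kernel has 5 trainable parameters)."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, r = np.linalg.qr(a)
    return q * (np.diagonal(r) / np.abs(np.diagonal(r)))

def to_density(v):
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

def quantum_blur_conv(feature_vec, kernel):
    """Partition a 16-dim (4-qubit) feature vector into four 4-dim pieces,
    encode each as a 2-qubit density matrix, evolve every piece with the
    SAME kernel (U rho U^dagger, the depthwise-style weight sharing), then
    rebuild a vector from the evolved diagonals and re-encode it as one
    4-qubit density matrix. The recombination rule here is our assumption."""
    pieces = feature_vec.reshape(4, 4)
    evolved = [kernel @ to_density(p) @ kernel.conj().T for p in pieces]
    diags = [np.clip(np.real(np.diagonal(e)), 0.0, None) for e in evolved]
    out_vec = np.sqrt(np.concatenate(diags))
    return to_density(out_vec)

rho_out = quantum_blur_conv(np.arange(1.0, 17.0), random_unitary(4))
```

The key point this illustrates is the weight sharing: one small kernel acts on every partition, exactly as a single depthwise filter acts on every channel.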

Since this quantum-inspired algorithm is an analogue of depthwise convolution, a brief comparison is worthwhile. Looking at the whole process, as shown in the right part of Fig.3a, the input tensor is divided into single-channel feature maps, each feature map is convolved with its own single kernel, and all output feature maps are concatenated into the output tensor. This flow clearly has three steps corresponding to those of depthwise convolution. From the viewpoint of reducing parameters and saving resources, the left part of Fig.3a shows group convolution with k·k·(C_in/g)·C_out parameters, where g is the number of groups, whereas depthwise convolution needs only k·k·C_in parameters. Similarly, as displayed in Fig.3b, the quantum-inspired module can be run on a 2-qubit parameterized quantum circuit with 5 parameters, whereas a normal quantum convolution would have to operate on a 4-qubit parameterized quantum circuit with many more parameters. Therefore, since expanding quantum circuits consumes computing resources exponentially, quantum-inspired blur convolution greatly saves computing resources and storage space.

Quantum model evaluation. The above structural innovations are applied to the components of our quantum style-based GAN. First, we solve the quantum encoding of gene sequences: whatever the data format, vector or matrix, we provide a method to encode it into a density matrix of the corresponding size, which may inspire further research. Second, quantum progressive training is carried out in our model. By training the quantum block-wise network progressively, we save substantial training time and stabilize the training process; this also suggests a practical way to train a quantum network with a relatively large number of qubits. Third, we design a new quantum layer, quantum-inspired blur convolution, which reduces parameters, saves computing resources and serves the role of depthwise convolution in quantum machine learning, with the expected benefits of noise reduction and feature extraction. This construction not only mirrors the classical style-based GAN but can also be used like any other quantum layer in later research.

Quantitatively, we evaluate our self-developed quantum GAN's variation structure prediction on the spike protein cohort using fidelity. We randomly choose ten generated variation structures and ten gene sequences from the spike protein cohort to compute a fidelity heatmap. Although the quantum discriminator has far fewer parameters than its classical counterpart, the heatmap indicates that all fidelities of the selected generated sequences are over 96% for Delta (Fig.3d) and 94% for Omicron (Fig.3e).
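The text does not spell out its fidelity convention; the sketch below uses the standard Uhlmann definition, which for the rank-1 encoded states reduces to the squared overlap of the underlying vectors:

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a positive semi-definite matrix via eigendecomposition."""
    w, u = np.linalg.eigh(m)
    w = np.clip(w, 0.0, None)
    return (u * np.sqrt(w)) @ u.conj().T

def fidelity(rho, sigma):
    """Uhlmann fidelity F = (Tr sqrt(sqrt(rho) sigma sqrt(rho)))^2.
    For rank-1 encoded states this reduces to |<u|v>|^2."""
    s = psd_sqrt(rho)
    return float(np.real(np.trace(psd_sqrt(s @ sigma @ s))) ** 2)

u = np.array([1.0, 0.0])
v = np.array([1.0, 1.0]) / np.sqrt(2.0)
rho, sigma = np.outer(u, u), np.outer(v, v)
f = fidelity(rho, sigma)   # equals |<u|v>|^2 = 0.5 for these pure states
```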

As shown in Fig.3d, the darkest blue line corresponds to the fidelities between the ten randomly generated Delta spike protein sequences and OU007056.1 from strain P2, meaning these generated sequences share the highest fidelity in the heatmap and are most similar to OU007056.1. The lightest blue regions mark the lowest fidelities, concentrated between the fourth, eighth and ninth generated sequences and OD934760.1 from B.1.1.7 as well as OD979867.1 from B.1.525. However, high fidelity could result from little mutation relative to the original SARS-CoV-2, and the opposite situation may produce low fidelity. We can also draw conclusions along horizontal lines: for instance, the sixth generated sequence is more similar to MW638371.1 from B.1.429 than to the other references in its row.

In the same way, Fig.3e suggests that the highest fidelity appears at the intersection of the 2nd generated Omicron spike protein sequence and MW708379.1 from B.1.526, while the lowest fidelity lies in the fourth row, with OD934709.1 from B.1.1.7.

Performance evaluation of generated spike protein variation structure. Apart from the fidelity between the model-generated spike protein sequences and the selected variants' spike protein sequences, a total of 1000 output sequences generated by the model are analyzed for their specific mutation positions in both the whole sequence and the spike protein sequence[19]. From Fig.1a, the Delta and Omicron variants share some single mutational sites but still follow their own distinctive mutation patterns. Single nucleotide substitutions detected in the existing Delta and Omicron samples on the spike protein sequence region accumulate around positions 300-500 (the first peak) and around 2000 (the second peak). From Fig.4, the high-frequency positions in the generated diagram appear as several peaks in the ranges 300-500 and 3300-3500. The generated sequences share a relatively similar pattern with the existing samples at the first peak, but show a minor deviation from the positional information of the second peak.

From the mutational frequency of the existing Delta variant sequences, the peak at positions 300-500 of the nucleotide sequence lies in the S1 region upstream of the RBD[20, 21], and the peak around 2000 lies in the RBD region of the spike protein[1, 22]. Within the RBD, the spike protein mutation at amino acid position 501 has the potential to increase ACE2-binding affinity by prolonging the "open conformation" of the ACE2-binding receptor. The mutation at amino acid position 519 is believed to decrease convalescent serum neutralization, affecting the efficacy of vaccines and antibody detection in the body. The mutation at amino acid position 614 increases the viral load and produces more infectious viral particles in humans in vivo[23, 24]. Each of these mutations has been shown by quantitative analysis to increase infectivity or to evade the human immune system[25, 26, 27].

From the mutational frequency of the Omicron input samples, Omicron contains more amino acid deletions and mutational sites on the spike protein than Delta[42, 43]. Omicron has the same mutational sites at positions 478 and 501, which give this variant increased ACE2-binding affinity. The extra mutation at 484 has the potential to cause immune escape, and the deletions at 69 and 70 are responsible for S-gene target failure (SGTF)[43].

To evaluate the model's effectiveness from a biological perspective, we use the generated sequences to develop an R script that integrates the data into a visual presentation of the mutational frequency at each potential site using bar plots. Based on the significant mutation positions in the whole sequences and spike protein sequences, the differences and similarities between the generated and existing variants are deduced to assess the plausibility of the predicted variants and the variability of the model[28, 36]. From the resulting frequency maps, positions 300-500 (Fig.4a and 4e) and 3300-3500 (Fig.4d and 4h) are considered the most likely ranges for a single nucleotide variation.

Based on the generated sequences for the Delta variant in Fig.4a, the most favored mutation sites occur in the range 300-500. This range of the nucleotide sequence lies in the S1 region upstream of the RBD[20, 21], which indicates that positions 300-500 have no clear structural function and do not participate in the construction of the protein[29, 30]. A plausible explanation is that this region, being upstream of the S1 region, may encode regulatory proteins.

After adding 40 Omicron sequences to the 734 variant samples and rerunning the model, the input frequency map shows no obvious change, but the generated sequences show a slight shift of the mutational sites on the spike protein sequence. In Fig.4e, besides the frequent mutations between 300-500, the region from the start of the spike protein to position 300 also shows an inclination to mutate. Positions 0 to 300 in the S1 region correspond to the N-terminal domain, whose role is still not well characterized[44, 45]. However, its ability to identify specific carbohydrate moieties on human cells during early attachment makes it a suspected factor in increasing ACE2-binding affinity[46].

As shown in Fig.4b and 4f, no significant mutational sites appear between positions 1000 and 2000, which is consistent with the actual situation.

Based on Fig.4c and 4g, the generated sequences for both Delta and Omicron show highly frequent mutations in the second half of the circle, around positions 2400-2800. The corresponding protein sequence spans the FP, HR1 and HR2 sections of the S2 subunit shown in Fig.1a. The FP region contains a high content of glycine and alanine and works together with HR1 and HR2 to anchor the virus and enable viral entry into the host cell after the RBD binds to the cell.

In Fig.4d and 4h, both Delta and Omicron generate new mutation sites from position 3300 to the end of the spike protein sequence, corresponding to the transmembrane domain and the cytoplasmic tail domain. Both parts are believed to be highly conserved and to play an important role across related coronaviruses[47]. However, given insufficient information and our limited understanding of SARS-CoV-2, it is hard to conclude whether these mutations would decrease the infectivity of the virus or alter its mechanism to cause more severe damage.

As mentioned above, the higher-frequency positions (within 300-500 and 3300-3500) exhibited by the prediction model are a substantial indication of the future evolution of COVID-19 variants as the pandemic continues. With the assistance of the model, we have provided meaningful predictions of possible mutations from a novel perspective. This result may shed light on future mutation trends and can be applied to other virus strains. Given a complete understanding of the function of each region of the spike protein, together with more comprehensive knowledge of protein folding and other conformational changes, our model can incorporate such additional knowledge and will hence become more complete and robust.

In this work, we build a quantum style-based GAN for predicting the spike protein variation structure of COVID-19 epidemic strains from mutated SARS-CoV-2 RNA sequences. The distinct differences in predicted mutational sites before and after adding Omicron samples support the model's practicability and rule out random artifacts. The performance also shows the great potential of quantum machine learning. Using quantum information metrics, we obtain high fidelities, as shown in Fig.3, usually surpassing 96% for Delta and 94% for Omicron. This means the generated variation structures of COVID-19 epidemic strains carry both quantum-model and biological significance. Furthermore, we obtain a stable and convergent quantum GAN training process, which is uncommon in classical models.

There are also some limitations and future directions for this hybrid quantum-classical model. First, more mutated SARS-CoV-2 RNAs would benefit model training, and such data are accessible given the many circulating variants. Second, more sophisticated quantum encoding can be developed in the future. Third, we could replace the existing mapping method by inserting the Delta or Omicron cohort into a 10-qubit module following the principle of style mixing. Finally, we might modify the generator to handle gene sequences directly by leveraging transformers.

Fortunately, the construction of the DeepQuantum framework accelerated the development of our model. The framework serves as a quantum version of PyTorch, tightly combining quantum theory with deep learning. With the help of DeepQuantum, researchers can explore more hybrid quantum-classical models in various application scenarios and bring the advantages of quantum algorithms into deep learning applications.


In summary, quantum style-based GAN can successfully predict the variation structure of COVID-19 epidemic strains and is promising for application to other zoonotic viruses. The model generates mutational-site frequency tables with and without the addition of Omicron samples and demonstrates a clear differentiation between their distinctive characteristics. The model also includes quantum innovations such as quantum progressive training and quantum-inspired blur convolution, and validates the power of quantum networks through its high fidelities and stable training process.

More sophisticated conclusions may lie in a closer combination of quantum algorithms and classical artificial intelligence. Our proposed model stabilizes the GAN training process, providing an example of enhancing artificial intelligence (AI) with quantum computation. Moreover, with the help of quantum computation, some problems that are difficult for classical deep learning methods can be solved, and the elapsed time of quantum algorithms with the same methodology and effect as classical modules can be greatly reduced. In addition, the participation of AI in quantum computation naturally enriches the solutions to problems in quantum fields.

Finally, DeepQuantum works as a simple tool combining the PyTorch style of deep learning programming with the Qiskit style of quantum programming. It can be an effective bridge between quantum computation and artificial intelligence, contributing to the development of hybrid quantum-classical models that exploit the advantages of quantum algorithms.


Strain selection and alignment. Before selecting appropriate variant strains, a wild-type sequence, severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, first recorded in March 2020, was used as the reference for analysis and alignment. NCBI has documented this sequence as NC_045512. The prevalence of COVID-19 cases surged in the preceding months with the emergence of two new VOCs, Delta and Omicron. These two variants were specifically chosen for predicting spike protein mutation frequency because of their accelerating spread and relatively high hazard worldwide. Based on Fig.1a, both variants contain a considerable number of mutational sites known to strengthen the virus. We used the GSA and GISAID databases, accessed from May 30 to June 29, 2021, to derive 734 mutated SARS-CoV-2 sequences from 8 different variants[17, 18]. After the explosion of the Omicron variant, we derived 40 good-quality Omicron samples and added them to the training set for the second prediction model. For accuracy and precision, collected sequences were required to be complete and to have multiple reviews.

CoVariants and CovidCG provide integrated, pre-analyzed data in a well-classified format that simplifies our data processing. CoVariants, backed by the GISAID database, is used as a source of detailed descriptions and protein-model presentations for each variant[18]. Single nucleotide variant frequencies on the nucleotide sequence and their corresponding amino acids are presented as bar plots on the CovidCG website. Co-occurrence of different mutational sites is shown in CovidCG, providing a thorough view of the interactions between sites.

Statistical computing software is used to analyze the data and investigate the relationships and summaries of interest. The frequency distribution of mutational sites within the selected sequences is measured in bar plots using R, with an algorithm that selectively picks high-frequency positions by thresholding. The selected genomes were aligned using clustalo in multi-alignment mode to obtain the relative probability of a mutation at each position for constructing the prediction model.

Interception and compression of spike protein gene sequences. We construct the spike protein cohort to train the proposed hybrid quantum-classical model. After alignment with the first identified SARS-CoV-2 genome, NC_045512.2, we intercept loci 21563-25384 of the 774 mutated SARS-CoV-2 complete genomes as spike protein gene sequences.

Each spike protein gene sequence thus contains 3822 loci, which does not match the 1024-dimensional input of 10 qubits. We therefore compress out some redundant information to obtain final spike protein gene sequences of size 1024 that match the input of the quantum networks. We first fix all mutated loci by comparing the spike protein gene sequences with NC_045512.2 one by one. Then, we reserve the positions adjacent to the mutated loci. Finally, the 1024 loci are completed by randomly selecting other positions, and we record all the chosen positions. This yields a spike protein cohort of 1024 loci.
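The compression can be sketched as follows. This is a sketch under our assumptions: the neighbourhood kept around each mutated locus (one position on each side) and the random-padding rule are our guesses at details the text leaves open:

```python
import numpy as np

def compress_loci(mutated_positions, seq_len=3822, target=1024, seed=0):
    """Keep every locus observed to mutate in the cohort plus its
    immediate neighbours, then pad with randomly chosen remaining loci
    up to `target` positions."""
    rng = np.random.default_rng(seed)
    keep = set()
    for p in mutated_positions:
        keep.update(q for q in (p - 1, p, p + 1) if 0 <= q < seq_len)
    rest = [q for q in range(seq_len) if q not in keep]
    extra = rng.choice(len(rest), size=target - len(keep), replace=False)
    return sorted(keep | {rest[i] for i in extra})

chosen = compress_loci([100, 500, 1500])   # hypothetical mutated loci
```

Recording `chosen` lets the 1024-dimensional vectors be mapped back to the 3822-locus sequence later.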

Quantum encoding for generated feature maps. For spike protein gene sequences in the training cohort, we simply apply quantum encoding to the vectors. However, generated feature maps produced by the quantum style-based GAN generator require extra steps. First, we take the diagonal of each generated feature map. By taking the square root of the diagonal, we obtain a generated vector to which quantum encoding can be applied as described above: in brief, we perform L2 normalization and then multiply by the conjugate transpose.
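The encoding described above (L2 normalization, then multiplication by the conjugate transpose to form a pure-state density matrix) can be sketched in NumPy; the function names are ours.

```python
import numpy as np

def quantum_encode(vec):
    """Amplitude-encode a vector (length 1024 for 10 qubits):
    L2-normalize, then form the pure-state density matrix |psi><psi|."""
    psi = np.asarray(vec, dtype=complex)
    psi = psi / np.linalg.norm(psi)         # L2 normalization
    rho = np.outer(psi, psi.conj())         # multiply by conjugate transpose
    return psi, rho

def vector_from_feature_map(fmap):
    """Recover a generated vector from a feature map by taking the
    square root of its diagonal, as described above."""
    return np.sqrt(np.clip(np.diag(fmap).real, 0.0, None))
```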

Variation structure generating and mapping. The variation structure information learned by the quantum style-based GAN is stored in the generated feature maps. In order to extract accurate and reliable mutation information that also meets biological significance, we proceed as follows.

By investigating the similarity between the 774 mutated SARS-CoV-2 spike protein gene sequences and NC_045512.2, we set 98% (97%) as the similarity of generated Delta (Omicron) spike protein gene sequences. In other words, in each generated spike protein gene sequence, only 21 (31) of the 1024 loci mutate while the others remain identical to NC_045512.2. To obtain these mutation positions, we first derive a probability vector by taking the square root of the diagonal of each generated feature map. Next, we sort the values of this probability vector and choose the top 21 (31) indices as mutation positions. This is insufficient unless we map these mutation positions back to the original spike protein gene sequence of 3822 loci, or even the complete genome; in fact, we already do so, since we report mutation positions at all three scales (1024, 3822, and whole genome) for each generated variation structure.
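Extracting the top-k mutation positions and mapping them back can be sketched as below. The `index_map` argument, mapping each of the 1024 compressed loci back to its position in the 3822-locus gene, is our assumption about how the recorded positions are used.

```python
import numpy as np

def mutation_positions(fmap, k, index_map):
    """Pick the k most probable mutation loci from a generated feature
    map and translate them via index_map to original coordinates."""
    prob = np.sqrt(np.clip(np.diag(fmap).real, 0.0, None))  # probability vector
    top = np.argsort(prob)[::-1][:k]                        # top-k indices
    return sorted(int(index_map[i]) for i in top)
```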

Mapping a generated variation structure to COVID-19 epidemic strains follows a similar principle. Specifically, we randomly choose some aligned spike protein gene sequences of the COVID-19 epidemic strains Delta or Omicron and map the generated variation structure onto them at the mutation positions. In this way, we obtain new mutated gene sequences possessing the variation characteristics of SARS-CoV-2 variants.

Torch-based quantum deep learning. We implement our hybrid quantum-classical model QuStyleGAN with a self-built torch-based software framework, DeepQuantum. DeepQuantum is built with the aim of combining quantum computing with machine learning and providing a powerful tool for constructing hybrid quantum-classical models in interdisciplinary research efficiently. It is designed with the following characteristics:

  • QML application platform. QML is the abbreviation of quantum machine learning. DeepQuantum couples quantum information tightly with machine learning because its basic quantum structures are defined by torch variables such as torch.tensor and torch operators. Thanks to this natural advantage, DeepQuantum can serve as an ideal platform for developing various hybrid quantum-classical models and exploring quantum advantages.

  • Simultaneous parameter optimization.

    By leveraging the torch computation graph, DeepQuantum can construct hybrid quantum-classical models with seamless backpropagation between classical and quantum neural networks. Meanwhile, parameterized hybrid quantum-classical models can be trained with all parameters updated simultaneously by a classical torch optimizer.

  • Facilitating interdisciplinary research. Quantum machine learning can be combined with various disciplines. DeepQuantum provides powerful tools to promote interdisciplinary research, such as prediction of binding energy by quantum mutual learning and prediction of gene interactions by quantum variational circuits.
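The seamless-backpropagation idea can be illustrated with a toy single-qubit example: when a rotation gate is built from torch tensors, one backward pass differentiates through both the simulated quantum circuit and a classical head. This is a minimal sketch of the mechanism, not DeepQuantum's actual API.

```python
import torch

def ry(theta):
    """Single-qubit RY rotation built from torch ops, so autograd
    can differentiate through the quantum simulation."""
    c, s = torch.cos(theta / 2), torch.sin(theta / 2)
    return torch.stack([torch.stack([c, -s]), torch.stack([s, c])])

theta = torch.tensor(0.3, requires_grad=True)   # quantum parameter
w = torch.tensor(2.0, requires_grad=True)       # classical parameter

state = ry(theta) @ torch.tensor([1.0, 0.0])    # |0> through RY(theta)
expect_z = state[0] ** 2 - state[1] ** 2        # <Z> expectation = cos(theta)
loss = (w * expect_z - 1.0) ** 2                # classical head on top
loss.backward()                                  # one pass updates both kinds
print(theta.grad, w.grad)
```

Both gradients are produced by the same backward pass, which is exactly what allows a classical torch optimizer to update quantum and classical parameters simultaneously.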

Fidelity. We evaluate our hybrid quantum-classical model by fidelity from quantum information. Gene sequences from the spike protein cohort are denoted $\rho$ after quantum encoding, while generated spike protein gene sequences are denoted $\sigma$ after quantum encoding. We use the following formula to calculate the fidelity:

$$F(\rho,\sigma)=\left(\operatorname{Tr}\sqrt{\sqrt{\rho}\,\sigma\sqrt{\rho}}\right)^{2}.$$

Also, since the encoded states are pure, $\rho=|\psi\rangle\langle\psi|$ and $\sigma=|\phi\rangle\langle\phi|$, we can use another form:

$$F(\rho,\sigma)=\operatorname{Tr}(\rho\sigma)=|\langle\psi|\phi\rangle|^{2}.$$
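For pure encoded states, the fidelity reduces to a squared inner product and can be computed directly; a minimal NumPy sketch (function name is ours):

```python
import numpy as np

def fidelity(psi, phi):
    """Fidelity between two pure encoded states: |<psi|phi>|^2."""
    return abs(np.vdot(psi, phi)) ** 2

a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0]) / np.sqrt(2)
print(fidelity(a, b))  # 0.5 for these two states
```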
Two kinds of loss functions. We use logistic loss and relativistic-hinge loss to compare the training process between the quantum and classical style-based GAN. Here we give the formulas for both loss functions.

For the quantum model, we use a non-saturated logistic loss without gradient penalty, as below:

$$L_{G}=\mathbb{E}_{z}\big[\operatorname{softplus}\big(-D(G(z))\big)\big],\qquad L_{D}=\mathbb{E}_{x}\big[\operatorname{softplus}\big(-D(x)\big)\big]+\mathbb{E}_{z}\big[\operatorname{softplus}\big(D(G(z))\big)\big],$$

where $G$ represents the generator, $D$ the discriminator, $z$ the latent code, and $x$ the true data. For the classical model, we add a penalty term to the discriminator loss just for successful training. The penalty term is the $R_1$ regularizer:

$$R_{1}=\frac{\gamma}{2}\,\mathbb{E}_{x}\big[\|\nabla_{x}D(x)\|^{2}\big].$$

In the experiment, we take fixed values for the penalty coefficient $\gamma$ and the regularization interval.

For the relativistic-hinge loss, both the quantum and classical style-based GAN adopt the same form. We first compute two differences, as below:

$$\tilde{D}_{r}=D(x)-\mathbb{E}_{z}\big[D(G(z))\big],\qquad \tilde{D}_{f}=D(G(z))-\mathbb{E}_{x}\big[D(x)\big].$$

Then, we construct the generator loss and discriminator loss:

$$L_{G}=\mathbb{E}\big[\max(0,\,1+\tilde{D}_{r})\big]+\mathbb{E}\big[\max(0,\,1-\tilde{D}_{f})\big],\qquad L_{D}=\mathbb{E}\big[\max(0,\,1-\tilde{D}_{r})\big]+\mathbb{E}\big[\max(0,\,1+\tilde{D}_{f})\big],$$

where $\mathbb{E}[\cdot]$ represents the mean value.
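The relativistic-hinge construction described above (differences of each discriminator score from the mean score of the opposite batch, then hinge terms) can be sketched in torch; the function and variable names are ours.

```python
import torch
import torch.nn.functional as F

def rahinge_losses(d_real, d_fake):
    """Relativistic-hinge GAN losses from batches of discriminator
    scores on real data and on generated data."""
    diff_real = d_real - d_fake.mean()      # D(x) minus mean fake score
    diff_fake = d_fake - d_real.mean()      # D(G(z)) minus mean real score
    d_loss = F.relu(1 - diff_real).mean() + F.relu(1 + diff_fake).mean()
    g_loss = F.relu(1 + diff_real).mean() + F.relu(1 - diff_fake).mean()
    return g_loss, d_loss

g, d = rahinge_losses(torch.tensor([2.0, 1.0]), torch.tensor([-1.0, 0.0]))
```

On this toy batch the discriminator already separates real from fake by more than the hinge margin, so its loss is zero while the generator loss is large.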


Supplementary Materials: Quantum Deep Learning for Mutant COVID-19 Strain Prediction

StyleGAN. The style-based GAN (StyleGAN) is a generative adversarial network first proposed by NVIDIA after the progressive GAN (ProGAN). ProGAN was introduced to solve the quality issues of generating high-resolution images; the core idea behind the technique is progressive training. It starts with a low-resolution 4x4 image generator and discriminator and gradually elevates the resolution, so that the model learns to construct more details and generate images of much higher resolution. However, the limitation of ProGAN is that it cannot control specific features of the generated images: a slight modification of the random input parameter leads to a cascade of changes and deviations in the final generated image.

StyleGAN solves this problem by injecting the random input at different layers so that each scale manipulates only a particular set of features. Taking a human face as an example, a coarse-scale input controls features like pose, facial shape, hair style, and glasses; a medium-scale input controls finer features like details of the hair and the closure of the lips; at the fine scale, it is even possible to accurately control the color of the hair and the skin, and to deblur the background to make the image more realistic.

The innovation of StyleGAN lies mainly in the generator. Instead of feeding a latent code directly as in a traditional GAN, StyleGAN uses the output of a mapping network together with adaptive instance normalization (AdaIN) to replace the traditional input. Besides, replacing the traditional input with a learned constant is believed to reduce the entanglement between features and increase the quality of the image.
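The AdaIN operation mentioned above normalizes each feature map and then modulates it with a style-derived scale and bias; a minimal torch sketch (function name and tensor layout (N, C, H, W) are our assumptions):

```python
import torch

def adain(x, y_scale, y_bias, eps=1e-8):
    """Adaptive instance normalization: per-channel normalize x over its
    spatial dimensions, then apply style scale and bias."""
    mu = x.mean(dim=(2, 3), keepdim=True)
    sigma = x.std(dim=(2, 3), keepdim=True)
    return y_scale * (x - mu) / (sigma + eps) + y_bias

x = torch.randn(1, 3, 16, 16)                       # feature maps
styled = adain(x, torch.ones(1, 3, 1, 1), torch.zeros(1, 3, 1, 1))
```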

Figure S1: Detailed quantum style-based GAN. a 2-qubit QuConv. b 2-qubit QuPool. c 2-qubit Qudense. d StyleGAN Generator. e Style Mixing RNA. f Quantum Progressive Discriminator.

QCNN based on quantum circuits. Quantum convolutional networks mainly comprise quantum convolution layers, quantum pooling layers, and quantum dense layers, and are built from Rx, Ry, and Rz rotation gates and CNOT gates. As shown in Fig. S1, our quantum progressive discriminator uses three kinds of QuConv, three kinds of QuPool, and one QuDense.

For the first four blocks, we stack the first quantum convolution layer, the first quantum pooling layer, a quantum-inspired blur convolution layer, the second quantum convolution layer, the second quantum pooling layer, and a partial trace operation. For the last block, we use the last quantum convolution layer, the last quantum pooling layer, and the quantum dense layer (the circuits are defined in Fig. S1).
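A 2-qubit convolution unit of the kind shown in Fig. S1 (Rx, Ry, Rz rotations on each qubit followed by a CNOT) can be sketched as a unitary in NumPy. The parameter layout and qubit ordering are our assumptions, not the authors' exact circuit.

```python
import numpy as np

CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)

def rx(t): return np.array([[np.cos(t / 2), -1j * np.sin(t / 2)],
                            [-1j * np.sin(t / 2), np.cos(t / 2)]])
def ry(t): return np.array([[np.cos(t / 2), -np.sin(t / 2)],
                            [np.sin(t / 2),  np.cos(t / 2)]], dtype=complex)
def rz(t): return np.array([[np.exp(-1j * t / 2), 0],
                            [0, np.exp(1j * t / 2)]])

def quconv2(params):
    """One 2-qubit QuConv unit: Rx, Ry, Rz on each qubit, then CNOT."""
    a, b = params[:3], params[3:]
    u0 = rz(a[2]) @ ry(a[1]) @ rx(a[0])   # rotations on qubit 0
    u1 = rz(b[2]) @ ry(b[1]) @ rx(b[0])   # rotations on qubit 1
    return CNOT @ np.kron(u0, u1)

U = quconv2(np.zeros(6))  # all-zero angles reduce the unit to a bare CNOT
```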

Group convolution and depthwise convolution. A grouped convolution, i.e. a set of convolutions, was first used to distribute a network over multiple GPUs as an engineering compromise, but was found to help the network learn multi-level features and to improve classification accuracy in ResNet-style models. Moreover, by exposing a new dimension, the cardinality, which refers to the size of the set of transformations, increasing the number of groups was shown to further improve accuracy.

In depthwise convolution, each filter channel acts on only one input channel. In the following example with 3 channels, the depthwise convolution splits the filter and the image into three separate channels, convolves each image channel with its corresponding filter channel, and stacks the results back together. To reproduce the same effect with a normal convolution, one would have to select a channel, zero out all filter elements except that channel, and then convolve, so three different filters would be needed, one per channel. Although the parameter count of each filter channel stays the same, the depthwise convolution produces three output channels from what is effectively a single filter, since the three per-channel filters are merged into one.
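In torch, a depthwise convolution is a `Conv2d` with `groups` equal to the number of input channels; the sketch below contrasts its parameter count with a standard convolution of the same shape.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 8, 8)
# groups=3: each filter slice convolves exactly one input channel
depthwise = nn.Conv2d(3, 3, kernel_size=3, padding=1, groups=3, bias=False)
standard = nn.Conv2d(3, 3, kernel_size=3, padding=1, bias=False)

print(depthwise(x).shape)                              # torch.Size([1, 3, 8, 8])
print(sum(p.numel() for p in depthwise.parameters()))  # 27 weights (3x1x3x3)
print(sum(p.numel() for p in standard.parameters()))   # 81 weights (3x3x3x3)
```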

Figure S2: Original training process of quantum and classical style-based GAN using logistic loss and relativistic-hinge loss.

Classical progressive training. Progressive GAN is the algorithm by which the StyleGAN generator and discriminator are trained on their dataset. It can be interpreted as gradually increasing the complexity of the input the model must fit. For example, a model trained with progressive GAN can start with small images, such as 4x4. Once the model is fit and stable, both the discriminator and the generator are expanded to twice their original size, e.g. 8x8, and a new block is built on top to support the larger image size. After a series of stabilization phases, the model again expands the dimensions of its generator and discriminator to handle more complex images, up to 1024x1024.
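The growth step commonly uses a fade-in: the new higher-resolution block is blended with the upsampled output of the old path by a coefficient alpha that ramps from 0 to 1. The sketch below illustrates this mechanism under our own module and variable names, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProgressiveStage(nn.Module):
    """One growth step: blend the upsampled old path with a newly
    added convolution block via the fade-in coefficient alpha."""
    def __init__(self, channels):
        super().__init__()
        self.new_block = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x, alpha):
        up = F.interpolate(x, scale_factor=2, mode="nearest")  # old path
        new = self.new_block(up)                               # new path
        return (1 - alpha) * up + alpha * new                  # fade-in blend

stage = ProgressiveStage(4)
y = stage(torch.randn(1, 4, 8, 8), alpha=0.5)
print(y.shape)  # torch.Size([1, 4, 16, 16])
```

At alpha = 0 the stage is a plain upsample of the stable lower-resolution model; at alpha = 1 the new block has fully taken over.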

Original training process of quantum and classical style-based GAN. To present the contrast between the training processes of the quantum and classical models clearly, we applied a logarithmic transformation to the original loss in the article. We display the original training process in Fig. S2.