I Introduction
With quantum supremacy and recent advances in quantum computing, quantum algorithms have attracted attention as a solution to various computation-intensive problems, where vast amounts of information can be processed faster [du2015spl, xiong2022spl]. Indeed, Shor's algorithm [Shor97] reduces the computation required for factorization, and Grover search [Grover96] reduces the search cost, a quantum supremacy that has been demonstrated on a real quantum device. Furthermore, IBM Quantum has publicized its 2022 development roadmap, according to which quantum computers with 10–100k qubits will be developed by 2026 [roadmap2022], and more quantum algorithms will prove their supremacy as quantum computers advance. Therefore, quantum computing is a strong candidate for next-generation computing, and this trend leads to research innovation in quantum machine learning (QML).
Recent studies have started reimplementing existing ML applications using quantum neural networks (QNNs), e.g., image classification [baek2022scalable] [yun2022quantum, icdcs2022yun] and federated learning (FL) [yun2022slimmable, KWAK2022]. In particular, quantum federated learning (QFL) is widely used in QML because distributed learning can be implemented with quantum communications [xu2022privacy, li2021quantum]. Motivated by these trends, and because split learning (SL) has shown success in privacy preservation [DSN21], we extend classical SL to a quantum SL (QSL) framework. A naïve QSL framework can be obtained by reimplementing classical SL in a quantum version. However, such a naïve approach cannot deal with the quantum nature. According to [baek2022scalable, baek2022sqcnn3d, baek2022fv], the features obtained from QCNNs with naïve training are similar to one another when input is given to the QCNNs. To cope with this problem, reverse fidelity training, which utilizes the quantum nature, has been proposed. Likewise, when QSL is designed, the quantum nature helps QML and gives insights.
We aim to extend SL to QSL by replacing NNs with QNNs [DSN21]. In addition, we propose cross-channel pooling (C2Pool), which utilizes the expected value of the probability amplitude, an inherent property of a QNN's output. We empirically show that leveraging C2Pool achieves higher accuracy, the lowest communication cost, and privacy preservation. The contributions of this paper are threefold: (1) we propose the first quantum split learning framework; (2) the proposed framework takes the characteristics of quantum nature into account; and (3) numerical experiments show that our framework outperforms naïve QSL and QFL.
II Related Work
II-A Basic Quantum Gates and Quantum Operations
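We introduce the basic quantum gates. In a single-qubit system, the quantum state is defined with two bases and probability amplitudes in the Hilbert space as $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$, where $|\alpha|^2 + |\beta|^2 = 1$. A $q$-qubit quantum state is written as $|\psi\rangle = \sum_{i=1}^{2^q} \alpha_i |i\rangle$, where $\alpha_i$ and $|i\rangle$ stand for the probability amplitude and the $i$-th standard basis in the Hilbert space, respectively. By the definition of the Hilbert space (i.e., $\langle\psi|\psi\rangle = 1$), the probability amplitudes satisfy $\sum_{i=1}^{2^q} |\alpha_i|^2 = 1$. The rotation and controlled gates linearly transform the probability amplitudes, and are denoted as
(1) $R_x(\delta) = e^{-i\frac{\delta}{2}X}, \quad R_y(\delta) = e^{-i\frac{\delta}{2}Y}, \quad R_z(\delta) = e^{-i\frac{\delta}{2}Z}, \quad CX = |0\rangle\langle 0| \otimes I + |1\rangle\langle 1| \otimes X,$
where $X$, $Y$, and $Z$ denote the Pauli-$X$, -$Y$, and -$Z$ gates, respectively, $\delta$ is the rotation angle, and $I$ is the identity. By leveraging these basic gates, we can represent a universal unitary gate $U$, which is a key element of a QNN.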
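As a concrete illustration of how these gates act on probability amplitudes, the following NumPy sketch builds the Pauli rotation gates under the standard convention $R_P(\delta) = e^{-i\delta P/2}$ together with a controlled-X gate, and applies them to a two-qubit state. The convention, the helper name `rotation`, and the choice of the first qubit as the control are assumptions made for illustration only.

```python
import numpy as np

# Identity and Pauli matrices
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def rotation(pauli, angle):
    """Return exp(-i * angle / 2 * pauli) = cos(angle/2) I - i sin(angle/2) pauli."""
    return np.cos(angle / 2) * I2 - 1j * np.sin(angle / 2) * pauli

# Controlled-X with the first qubit as control (basis order |00>, |01>, |10>, |11>)
P0 = np.array([[1, 0], [0, 0]], dtype=complex)
P1 = np.array([[0, 0], [0, 1]], dtype=complex)
CX = np.kron(P0, I2) + np.kron(P1, X)

# Start from |00>, rotate the first qubit by pi/2 about Y, then entangle
state = np.zeros(4, dtype=complex)
state[0] = 1.0
state = CX @ np.kron(rotation(Y, np.pi / 2), I2) @ state

# Squared magnitudes of the amplitudes give the basis-state probabilities
probs = np.abs(state) ** 2
```

Here `probs` is 0.5 for $|00\rangle$ and $|11\rangle$ and zero elsewhere, so the gates redistribute the amplitudes while the probabilities still sum to one.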
II-B Quantum Neural Network
We introduce the operation of quantum neural networks. As shown in Fig. 1, QNN is tripartite: the state encoder, quantum neural layer, and measurement [qoc2022wang].
State Encoder. Since uploading classical data to quantum circuits is challenging, several encoding methods for QNNs have been discussed, such as basis encoding, amplitude encoding, and angle encoding. Among these methods, we introduce an advanced angle encoding technique, i.e., data re-uploading. Given the classical data , the data re-uploading process is expressed as follows,
(2) 
where denotes the initial quantum state, i.e., . The input data are split into with the interval of , , where and denote the input vector (composed of the first to elements) and the trainable parameters, .
Quantum Neural Layer. Quantum neural layers process the quantum state . They can be utilized as a quantum convolutional neural network (QCNN) or a quantum fully connected network (QFCN). The rotation gates and controlled gates in (1) are used to compose the quantum neural layers. On a real quantum device, the design of a quantum neural layer is significant [quantumnas, qoc2022wang], because entangling qubits is challenging. However, quantum circuits can be simulated with classical computers. Thus, the quantum neural layer can be considered as a unitary operation, i.e., .
Measurement. To obtain a classical output, the quantum state must be projected onto a measurement device. We introduce three measurement methods, i.e., the projection-valued measure (PVM), the positive operator-valued measure (POVM), and the trainable measurement method [simeone2022introduction, yun2022slimmable, yun2022quantum, yun2022projection].

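PVM: The PVM makes a probability measure by projecting the quantum state $|\psi\rangle$ onto the projector $\Pi_i = |i\rangle\langle i|$ of the $i$-th basis state, and the output can be expressed as follows,
(3) $p_i = \langle\psi|\Pi_i|\psi\rangle = |\langle i|\psi\rangle|^2.$
PVM is known as a complete measurement, because it projects the quantum state onto all possible bases.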

POVM: Usually, the quantum state is projected onto Pauli measurement devices for every qubit [Qiskit], where Pauli is . The th projection matrix is defined as . In POVM, the expectation value of a projection is treated as a random variable, denoted as , where . The expectation value of the th qubit is . POVM is known as an incomplete measurement, because it projects the quantum state onto partial projection matrices.
Trainable Measurement: Trainable measurement was suggested by [lloyd2020quantum, schuld2022quantum]. To realize trainable measurement, the concept of a pole has been proposed and applied to multi-agent reinforcement learning [yun2022quantum] and federated learning [yun2022slimmable].
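The three stages of a QNN can be sketched end to end on a simulated two-qubit circuit. The sketch below is a simplified illustration rather than the circuit used in this paper: the chunk layout of `x` and `theta`, the RY encoding gates, the single CX entangler, and the choice of Pauli-Z as the per-qubit observable are all assumptions. It alternates data encoding and trainable rotations in the spirit of data re-uploading, and then reads out both the basis-state probabilities (a complete measurement) and one expectation value per qubit (a partial measurement).

```python
import numpy as np

def ry(angle):
    """Single-qubit RY rotation, exp(-i * angle / 2 * Y)."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

I2 = np.eye(2, dtype=complex)
Z = np.diag([1.0, -1.0]).astype(complex)
# Controlled-X, first qubit as control (basis order |00>, |01>, |10>, |11>)
CX = np.array([[1, 0, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 0, 1],
               [0, 0, 1, 0]], dtype=complex)

def qnn_state(x, theta):
    """State encoder + quantum neural layer; x and theta have shape (K, 2)."""
    state = np.zeros(4, dtype=complex)
    state[0] = 1.0  # initial state |00>
    for x_k, theta_k in zip(x, theta):
        encode = np.kron(ry(x_k[0]), ry(x_k[1]))               # upload one data chunk
        layer = CX @ np.kron(ry(theta_k[0]), ry(theta_k[1]))   # trainable block
        state = layer @ encode @ state
    return state

def basis_probabilities(state):
    """Probability of each of the four basis states."""
    return np.abs(state) ** 2

def pauli_z_expectations(state):
    """Expectation value of Z on each qubit, each in [-1, 1]."""
    observables = [np.kron(Z, I2), np.kron(I2, Z)]
    return np.array([np.vdot(state, obs @ state).real for obs in observables])

x = np.array([[0.1, 0.7], [1.2, -0.4]])       # hypothetical data, two chunks
theta = np.array([[0.3, -0.5], [0.9, 0.2]])   # hypothetical trainable angles
state = qnn_state(x, theta)
probs = basis_probabilities(state)
z_exp = pauli_z_expectations(state)
```

In this sketch the four basis-state probabilities determine the two expectation values, but not the other way around, which is one way to read the distinction between a complete and an incomplete measurement.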
II-C Optimizing Methods for QNN
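We introduce a QNN training method, i.e., a classical-quantum hybrid method. Since QNN optimization follows the backward quantum propagation of phase errors (Baqprop) principle, classical computing is required to obtain the gradient of the objective function; this is known as a classical-quantum hybrid method. It consists of two processes: (1) gradient calculation via classical computing and (2) the parameter-shift rule [qoc2022wang]. Suppose that the loss function is denoted as $\mathcal{L}(\boldsymbol{\theta})$, where $\boldsymbol{\theta}$ denotes the QNN parameters. The loss gradient with respect to the $k$-th QNN parameter is written as $\frac{\partial \mathcal{L}}{\partial \theta_k} = \frac{\partial \mathcal{L}}{\partial f} \cdot \frac{\partial f}{\partial \theta_k}$, where $f$ denotes the measured output. The classical computer calculates $\frac{\partial \mathcal{L}}{\partial f}$, and the term $\frac{\partial f}{\partial \theta_k}$ is calculated by quantum computers with forward propagation as follows,
(4) $\frac{\partial f}{\partial \theta_k} = \frac{1}{2}\left[ f\left(\boldsymbol{\theta} + \frac{\pi}{2}\mathbf{e}_k\right) - f\left(\boldsymbol{\theta} - \frac{\pi}{2}\mathbf{e}_k\right) \right],$
where $\mathbf{e}_k$ is a one-hot vector whose components are all zero except the $k$-th, i.e., $\mathbf{e}_k = [0, \dots, 0, 1, 0, \dots, 0]^{\top}$.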
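The hybrid procedure can be checked numerically on a toy circuit. The sketch below assumes the common form of the parameter-shift rule with a shift of $\pi/2$, which is exact for gates of the form $e^{-i\theta P/2}$ with a Pauli matrix $P$. The toy circuit, the squared-error loss, and the names `measured_output`, `parameter_shift`, and `target` are hypothetical and serve only to illustrate the split between classical and quantum computation.

```python
import numpy as np

def ry(angle):
    """Single-qubit RY rotation as a real 2x2 matrix."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]])

Z = np.diag([1.0, -1.0])

def measured_output(theta):
    """<Z> after applying RY(theta_k) in sequence to |0>; equals cos(sum(theta))."""
    state = np.array([1.0, 0.0])
    for angle in theta:
        state = ry(angle) @ state
    return float(state @ Z @ state)

def parameter_shift(f, theta):
    """Gradient of f from two shifted forward evaluations per parameter."""
    grad = np.zeros_like(theta)
    for k in range(len(theta)):
        e_k = np.zeros_like(theta)
        e_k[k] = 1.0  # one-hot vector selecting the k-th parameter
        grad[k] = 0.5 * (f(theta + np.pi / 2 * e_k) - f(theta - np.pi / 2 * e_k))
    return grad

theta = np.array([0.4, -1.1, 0.25])
target = 0.5  # hypothetical regression target

f_val = measured_output(theta)
df_dtheta = parameter_shift(measured_output, theta)  # forward passes on the circuit
dloss_df = 2.0 * (f_val - target)                    # classical part, loss = (f - target)**2
dloss_dtheta = dloss_df * df_dtheta                  # chain rule
```

For this circuit the measured output is $\cos(\sum_k \theta_k)$, so every entry of `df_dtheta` equals $-\sin(\sum_k \theta_k)$, which the shifted evaluations reproduce without any finite-difference approximation.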
II-D Recent Studies on Quantum Distributed Learning
We introduce recent studies on quantum distributed learning, including quantum federated learning (QFL). QFL is an architecture inspired by classical FL and improved by utilizing QML. The QFL model has many similarities with classical FL because it inherits many characteristics of the classical version. Multiple devices and a server are involved in QFL, and they repeatedly transmit model parameters for aggregation. In addition, QFL with quantum blind computing has been proposed to enhance privacy preservation by utilizing differential privacy with quantum computers. However, due to the qubit constraint, the application of QFL is severely limited. In [chen2021federated, huang2022quantum], binary classification, which is a relatively simple task, is used as the simulation environment. This is because leveraging spatial information requires many qubits. To cope with this issue, hybrid QFL has been proposed [chen2021federated], where a pre-trained feature extractor extracts low-dimensional features from images. In contrast, our work focuses on overcoming these issues of QFL. We elaborate on this in the next section.
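For reference, the aggregation step that QFL inherits from classical FL can be sketched as a weighted average of the local parameter vectors. This is the classical federated-averaging rule, shown under the assumption that each device holds a flat vector of QNN rotation angles; the function name `federated_average` and the toy values are hypothetical.

```python
import numpy as np

def federated_average(local_params, num_samples):
    """Average local parameter vectors, weighted by each device's data count."""
    weights = np.asarray(num_samples, dtype=float)
    weights = weights / weights.sum()
    # Contract the device axis of the stacked parameters with the weights
    return np.tensordot(weights, np.stack(local_params), axes=1)

# Three devices, each holding two trainable parameters
local_params = [np.array([0.0, 0.0]), np.array([1.0, 1.0]), np.array([2.0, 4.0])]
global_params = federated_average(local_params, num_samples=[1, 1, 2])
```

In each communication round the server would broadcast `global_params` back to the devices, so the full model is exchanged repeatedly, whereas split learning exchanges features and labels instead.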
III Quantum Split Learning
III-A Setup
In this paper, local devices and one server participate in our proposed quantum split learning. As shown in Fig. 2, each local device has its local data , and quantum neural networks. Each local device has QCNNs and one cross-channel pooling QNN. The local data and local QCNNs cannot be shared with anyone, whereas features and labels can be shared. The server processes the features and finally obtains the corresponding predictions.
III-B Local-side Operations
Quantum Image Processing. The classical data of local is denoted as , and the QCNN filters scan the image with stride where the filter size is . In this paper, we describe the patch , where denotes the number of output channels. The patch is uploaded to the QCNN via the data re-uploading method in (2). We adopt a scalable quantum convolutional neural network (sQCNN) [baek2022scalable] as the QCNN. With the QCNN and a patch , the feature is obtained, which is denoted as , where denotes the number of output channels.
Cross-Channel Pooling QNN. Similarly, the probability amplitude is obtained by the QNN, and the output is denoted as , where . We propose a cross-channel pooling method that takes the inner product of and . Therefore, the output is expressed as follows,
(5) 
After scanning all patches, the output features are obtained . Then, the local device transmits its feature-label set to the server. Note that the feature-label set is classical data, even though it is processed with QNNs.
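The pooling step can be illustrated with a short NumPy sketch. It assumes that, for every patch position, the pooling QNN yields one non-negative weight per channel and that these weights sum to one, so the pooled value is the inner product of the channel features with the weights. The array shapes, the random stand-ins for the QCNN features and the QNN output, and the name `c2pool` are hypothetical; only the 16 channels are taken from the ablation setting reported later.

```python
import numpy as np

def c2pool(features, weights):
    """Per-position inner product over channels; inputs have shape (C, H, W)."""
    return np.einsum("chw,chw->hw", features, weights)

rng = np.random.default_rng(0)
C, H, W = 16, 4, 4  # 16 channels; the spatial size is hypothetical

# Stand-in for QCNN features, one value per channel and patch position
features = rng.uniform(-1.0, 1.0, size=(C, H, W))

# Stand-in for the pooling QNN output: squared magnitudes of random
# amplitudes, normalized so the weights sum to one over the channel axis
amps = rng.normal(size=(C, H, W)) + 1j * rng.normal(size=(C, H, W))
weights = np.abs(amps) ** 2
weights = weights / weights.sum(axis=0, keepdims=True)

pooled = c2pool(features, weights)  # shape (H, W), a single channel

# Channel-average pooling is the special case of uniform weights
uniform = np.full((C, H, W), 1.0 / C)
avg_pooled = c2pool(features, uniform)
```

Under these assumptions the pooled feature is a data-dependent convex combination of the channels, whereas average pooling always uses the same uniform weights.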
III-C Server-side Operations
Prediction. On the server side, the features are processed with QCNNs and a QNN. The operations of the QCNNs are the same as on the local side, whereas the QNN performs multi-class classification. For this, we leverage the projection-valued measure, which is one of the quantum state projection methods [yun2022projection]. Then, we have a prediction value when quantum state is projected into basis states, i.e., . Note that our method is not based on the softmax function, i.e., it does not require softmax tuning parameters [jerbi2021variational].
QNN Training. We elaborate on how to train the QNN. The overall process is presented in the QSL algorithm. First of all, the th local device generates cross-channel pooled features from its data (lines 4–5). The local device transmits the features and labels to the server (line 6). After the server receives all features and labels, it makes predictions by feed-forwarding the features. For simplicity, we denote the trainable parameters as . We use the objective function for PVM, suggested by [yun2022projection], as follows,
(6) 
where , and denote the mini-batch, the binary cross-entropy, and the regularizer term, respectively. The terms and are expressed as,
(7)  
(8) 
where denotes the prediction value on class obtained by (3). Finally, in lines 8–12, the loss gradients of the local QNNs and the global QNN are calculated by (4), and the parameters are updated accordingly.
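The cross-entropy part of the objective can be sketched as follows, assuming that the prediction values are basis-state probabilities and that labels are one-hot encoded. The regularizer term is omitted because its form is not recoverable here, and the function name, the four-class toy batch, and the clipping constant `eps` are assumptions for illustration.

```python
import numpy as np

def binary_cross_entropy(probs, onehot, eps=1e-12):
    """Mean over the mini-batch of the class-wise binary cross-entropy."""
    p = np.clip(probs, eps, 1.0 - eps)  # avoid log(0)
    per_sample = -np.sum(onehot * np.log(p) + (1.0 - onehot) * np.log(1.0 - p), axis=1)
    return float(per_sample.mean())

# Hypothetical mini-batch: two samples, four classes (e.g., a two-qubit readout)
probs = np.array([[0.70, 0.10, 0.10, 0.10],
                  [0.25, 0.25, 0.25, 0.25]])
labels = np.array([0, 2])
onehot = np.eye(4)[labels]

loss = binary_cross_entropy(probs, onehot)
```

Because the measured probabilities already lie in $[0, 1]$, they can be fed to the loss directly, which is consistent with the remark that no softmax is needed.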
IV Performance Evaluation
IV-A Setup
The benchmark schemes are designed as follows.

Advantage of QSL: We compare the proposed framework to QFL with federated averaging (named ‘QFedAvg’) and to standalone training. We adopt top-1 accuracy as the metric. For QFedAvg, we adopt 10 local iterations per communication round.

Effectiveness of cross-channel pooling: We conduct an ablation study of cross-channel pooling. As comparison frameworks, we design QSL without any pooling and QSL with channel-averaging pooling. For this, we measure top-1 accuracy and communication cost.

Scalability: Since scalability with respect to the number of local devices is important in distributed machine learning, we conduct experiments with various numbers of local devices, i.e., .
We have conducted 10-class classification tasks with MNIST, Fashion-MNIST, and CIFAR-10, where all data are bicubic-interpolated to size. Note that all data are independent and identically distributed. Each local device has 512 data samples, which is also the batch size. In addition, each local and server model consists of trainable parameters, where U3 and CU3 gates are utilized [u3gate, cu3gate]. We use the Adagrad optimizer with an initial learning rate . In addition, we have conducted all experiments on a classical computer with the TorchQuantum Python library [quantumnas].
Metric  MNIST  Fashion-MNIST  CIFAR-10

Standalone  
QFL  
QSL 
IV-B Experimental Results
Performance. Table I shows the top-1 accuracy. On MNIST, QSL achieves 1.64% and 6.83% higher accuracy than QFedAvg and standalone training, respectively. On Fashion-MNIST, QSL achieves 2.37% and 6.76% higher accuracy than QFedAvg and standalone training, respectively. Lastly, a similar tendency is observed on CIFAR-10. Thus, our algorithm outperforms the other algorithms.
Effectiveness of C2Pool. Table II shows the results of the C2Pool ablation. Regarding top-1 accuracy, QSL with average pooling and QSL without pooling are inferior to QSL with C2Pool. In addition, QSL with C2Pool shows the smallest gap between training data and test data. Regarding communication, we calculate the communication cost per feature as follows,
(9) 
Average pooling and C2Pool reduce 16 channels to a single channel, whereas QSL without pooling transmits the original feature, which consists of 16 channels, to the server. In summary, our proposed C2Pool improves performance and reduces both the train-test gap and the communication cost.
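The effect of channel reduction on the transmitted volume can be sketched with simple arithmetic. The sketch assumes that the cost of one feature is the number of channels times the spatial size times the bytes per value; the function name, the 8x8 spatial size, and the 4-byte (32-bit floating point) representation are hypothetical, and only the channel counts of 16 and 1 come from the text.

```python
def comm_cost_bytes(channels, height, width, bytes_per_value=4):
    """Bytes needed to transmit one feature map of the given shape."""
    return channels * height * width * bytes_per_value

H = W = 8  # hypothetical spatial size of the transmitted feature

cost_without_pooling = comm_cost_bytes(16, H, W)  # all 16 channels are sent
cost_with_pooling = comm_cost_bytes(1, H, W)      # average pooling or C2Pool
reduction = cost_without_pooling // cost_with_pooling
```

Whatever the actual spatial size and numeric precision, the ratio between the two costs is fixed by the channel counts, so pooling to a single channel cuts the per-feature cost by a factor of 16 under this model.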
Scalability. Fig. 4 shows the experiment that increases the number of devices from to . When the number of local devices is 1, it corresponds to standalone training. When , the top-1 accuracy reaches 68.81%, which is the highest. Therefore, higher accuracy can be obtained as the number of local devices increases.
Privacy-Preserving Aspect. Fig. 5 presents the feature representations corresponding to the different quantum split learning frameworks. It is apparent that all features are harder to recognize than the original image, because the original image becomes distorted after passing through convolution. Especially with C2Pool, the outline of the digits is not clear, whereas convolution without pooling and convolution with average pooling clearly show the outline. That is, our proposed C2Pool is advantageous in terms of privacy preservation.
Metric  Train Acc.  Test Acc.  Commun. Cost [Byte] 

w/o pooling  
w/ avg-pooling  
w/ C2Pool 
V Conclusions and Future Work
In this paper, we propose a quantum split learning framework. To utilize the full potential of quantum split learning, we propose a probability-amplitude-based cross-channel pooling method. We corroborate that our proposed framework achieves reasonable performance, scalability with respect to the number of local devices, low communication cost, and privacy preservation in 10-class classification tasks. Since this work is the first quantum split learning framework, extending it to more detailed split learning settings is one research direction. Our future work will analyze differential privacy in the proposed cross-channel pooling method.