I Introduction
To date, the rapid evolution of communication networks has been predominantly driven by a chase for more bandwidth. Nonetheless, with the emergence of new Internet of Everything (IoE) services such as holographic teleportation, digital twins, and the immersive metaverse, leveraging computing resources intelligently has become a key necessity to deliver fundamentally complex and smart applications [chaccour2021edge]. Indeed, future 6G applications necessitate an overarching artificial intelligence (AI)-native communication infrastructure to fulfill their diversified, stringent requirements (e.g., high-resolution sensing, real-time control, high rates, and high-reliability low latency) [saad2019vision, chaccour2022seven]. Hence, this need inherently requires a paradigm shift with respect to the way a transmitter and receiver communicate and perceive each other. In other words, to satisfy AI-nativeness, future communication networks must rely further on utilizing computing resources in a computing-intensive approach while minimizing, as much as possible, the reliance on communication resources.
Towards this goal, current wireless networks must be re-engineered to remodel the transmission of information from being a mere reconstruction process of raw bits to one that inherently relies more heavily on understanding and sending the meaning of the information bits. In this new communication paradigm, the transmitter becomes a speaker that tries to capture the contextual meaning of the data and then embeds that meaning into a mathematical representation that accurately represents it, called a semantic representation. Finally, the speaker sends the semantic representations to the receiver, which is now a listener that tries to capture and understand the conveyed meaning from the speaker. Particularly, the listener must acquire the ability to reason and decipher the semantic representations so as to transform the information transmission from a bit-by-bit encoding to a contextual, descriptive one.
Designing a new semantic communication scheme faces fundamental challenges ranging from re-engineering the physical layer operation of current wireless networks to developing learning agents that can operate in near-real time. Chief among those challenges is the complexity associated with extracting contextual information and representing semantics in a generalizable fashion. In fact, successfully representing the data while conveying its meaning and respective context across a communication medium requires satisfying three major criteria: 1) A minimalist representation that enables the transmission of more information with fewer resources, 2) An efficient representation that ensures transmitting the principal semantic features of the message that are relevant to the context of the data, in contrast to spurious features, thus ensuring that the listener can effectively decode the meaning, and 3) An accurate, high-quality transmission of the semantic features from the speaker to the listener, while maintaining robustness to semantic errors.
I-A Prior Works
A number of recent works related to semantic communications appeared in [zhou2021semantic, xie2021task, tong2021federated, farshbafan2021common]. The work in [zhou2021semantic] proposed a semantic communication scheme using universal adaptive transformers; however, it was only applicable to text data transmissions. In [xie2021task], a multiuser semantic communication approach is proposed for multimodal data. The authors in [tong2021federated] developed a semantic communication system for audio transmission using convolutional neural networks and federated learning. While the works in [zhou2021semantic, xie2021task, tong2021federated] present valuable insights, their semantic approach is limited to specific data types. Meanwhile, the work in [farshbafan2021common] proposed an approach to construct a perfect semantic representation of data by starting from a comprehensive subset of all events and then performing pruning. While the methodology of [farshbafan2021common] is meaningful, the solution requires iterative processes that can incur nontrivial delays to the communication process. Clearly, a key gap in this prior work is the lack of rigorous approaches that enable extracting semantic representations from different data types. Thus, there is a need for a generalizable data representation approach that can decipher hidden patterns with respect to any data type in an efficient and minimalist manner.
To address this challenge, we shed light on significant similarities and common principles in the mathematical techniques underlying both AI and quantum computing (QC), particularly in the area of vector spaces, which plays a fundamental role in semantic modeling [widdows2021quantum]. In fact, we argue that quantum information theory (QIT) has a unique potential to enable the extraction and representation of semantic features of data [chehimi2022physics]. This claim is backed by the proven unprecedented capabilities of quantum mechanics in producing atypical data patterns [biamonte2017quantum], in addition to the promising potential for the utilization of quantum vector models to analyze and develop semantic representations [tetlow2022towards]. Clearly, it is necessary to investigate this unexplored avenue, which can ultimately open the door for various opportunities and benefits for the design, deployment, and analysis of semantic communication systems.
I-B Contributions
The main contribution of this work is, thus, a novel quantum semantic communications (QSC) framework that leverages the principles of QIT for developing semantic representations applicable to diverse services and data types. In particular, we first apply the principles of quantum embedding and quantum feature maps to elevate data into quantum states in high-dimensional Hilbert spaces. Then, to group the representations according to their contextual significance, we perform quantum clustering on the data by exploiting the concept of unsupervised quantum machine learning (QML). This enables minimizing the distance between the embedded states and obtaining an efficient semantic representation that corresponds to accurate contextual knowledge. Subsequently, to transmit the data, we integrate quantum communications in our QSC framework. In particular, we employ the concept of quantum entanglement in order to utilize fewer communication resources and mitigate the high overhead of iterative semantic transmissions. Further, we develop a novel performance evaluation framework to assess the QSC framework and to quantify the different quantum losses and noise sources so as to capture quantum semantic errors. To the best of our knowledge, this is the first work that promotes an integration of quantum embedding representations, QML, and quantum communications to design a novel semantic communication scheme. Simulation results validate that, unlike existing classical semantic communication frameworks and semantic-agnostic quantum communication schemes, the QSC framework achieves minimality, efficiency, and accuracy in the extraction and transmission of the contextual meaning of the data.
II System Model
Consider a semantic communication system in which a speaker observes and collects large amounts of raw data (e.g., images, text, etc.), gathered in a dataset $\mathcal{X}$, from their surrounding environment. The speaker’s intent is to transmit semantic representations of their conveyed message to a remote listener. In contrast to classical communications, whereby the receiver’s goal is usually to reconstruct the exact bitwise message sent, herein, the main objective of the listener is to understand the meaning (semantics) conveyed. Next, we delve into our novel semantic representation methodology, which is the first such technique in the literature that operates independently of the data type. This framework relies on the principles of QIT, high-dimensional Hilbert spaces, and QML.
II-A Quantum Semantic Information with Quantum Embeddings
The smallest unit of quantum information is a qubit. Unlike classical binary bits, qubits can be in any superposition of both 0 and 1 bits. A general qubit, or a quantum state in a two-dimensional Hilbert space, is defined as:
$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle,$$
where $\alpha, \beta \in \mathbb{C}$ and $|\alpha|^2 + |\beta|^2 = 1$. Although a qubit can be in any superposition of the $|0\rangle$ and $|1\rangle$ states, when a qubit is measured, its quantum state collapses to either a 0 or 1.
In classical communications, multiplexing enables us to combine multiple messages into a shared medium (frequency, time, or space). Similarly, in the quantum world, a quantum state that can hold a superposition of $d$ basis vectors is called a qudit, and it contains information in $\log_2 d$ classical bits. This superposition inherently enables us to enhance the capacity of the information contained within a quantum state. The general representation of a qudit is a vector in a $d$-dimensional Hilbert space represented as:
$$|\psi\rangle = \sum_{i=0}^{d-1} \alpha_i |i\rangle, \tag{1}$$
where $\sum_{i=0}^{d-1} |\alpha_i|^2 = 1$. The quantum state can also be represented within a density matrix representation as a sum of pure states, i.e., $\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|$, where $p_i$ is the selection probability for the individual states $|\psi_i\rangle$. In this formulation, the $d$-dimensional quantum state can be seen as an analogy of the concept of finite vocabulary in the information-theoretic domain. Thus, the orthogonal basis vectors $\{|i\rangle\}$ spanning the Hilbert space construct a common language, i.e., a vocabulary of contextual meanings, that can help in the creation of a proper semantic representation of the data [tetlow2022towards]. Here, every superposition of the basis vectors corresponds to a unique contextual meaning that is part of the common language.
To develop the notion of quantum semantics, we mainly leverage the characteristics of high-dimensional quantum states in representing semantics jointly with the use of QML techniques, as discussed in Section III-A. Particularly, embedding the classical data in a high-dimensional Hilbert space, followed by the use of quantum clustering techniques, would result in efficient extraction of the hidden data patterns and their contextual meaning.
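As a rough numerical illustration of the qudit and density-matrix definitions in this subsection, the following sketch builds a few normalized state vectors in a $d$-dimensional space and mixes them into a density matrix. The helper names (`random_qudit`, `density_matrix`), the dimension, and the selection probabilities are illustrative assumptions, not quantities taken from this paper.

```python
import numpy as np

def random_qudit(d, rng):
    """Return a normalized random state vector in a d-dimensional Hilbert space."""
    amplitudes = rng.normal(size=d) + 1j * rng.normal(size=d)
    return amplitudes / np.linalg.norm(amplitudes)

def density_matrix(states, probs):
    """Mixture rho = sum_i p_i |psi_i><psi_i| of pure states."""
    return sum(p * np.outer(s, s.conj()) for p, s in zip(probs, states))

rng = np.random.default_rng(0)
d = 4                                  # hypothetical qudit dimension
states = [random_qudit(d, rng) for _ in range(3)]
probs = [0.5, 0.3, 0.2]                # hypothetical selection probabilities
rho = density_matrix(states, probs)

trace_rho = np.trace(rho).real         # a valid density matrix has unit trace
purity = np.trace(rho @ rho).real      # equals 1 only for a pure state
print(f"trace = {trace_rho:.3f}, purity = {purity:.3f}")
```

A purity strictly below one signals a genuinely mixed state, which is the regime that matters later when noise is considered.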
We encode the classical data from the set $\mathcal{X}$ into quantum states in the $d$-dimensional Hilbert space $\mathcal{H}$ by using a quantum feature map $\phi: \mathcal{X} \rightarrow \mathcal{H}$. Here, the Hilbert space’s dimension, $d$, is much larger than the dimension of the classical dataset $\mathcal{X}$. The quantum feature map $\phi$, which maps $x \mapsto |\phi(x)\rangle$, is implemented via a quantum circuit $U_\phi(x)$, called the quantum-embedding circuit. This circuit takes the classical data $x \in \mathcal{X}$ as input and is applied to a ground state, $|0 \ldots 0\rangle$, in the $d$-dimensional Hilbert space $\mathcal{H}$. Essentially, such a quantum feature map yields the quantum embedded states according to $U_\phi(x)|0 \ldots 0\rangle = |\phi(x)\rangle$ [schuld2019quantum]. After constructing the quantum-embedded semantic representations of the classical data, next, we discuss the quantum communication scheme between the speaker and the listener.
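To make the idea of a quantum feature map concrete, the sketch below simulates one common choice, a product "angle" embedding, in which each classical feature sets the rotation angle of one qubit and the full state is the tensor product of those qubits. This particular map and the function name `angle_embed` are assumptions chosen for illustration; the embedding circuit actually used in the framework may differ.

```python
import numpy as np
from functools import reduce

def angle_embed(x):
    """Map a real vector x (n features) to a 2**n-dimensional product state.

    Feature x_j is encoded as cos(x_j)|0> + sin(x_j)|1> on qubit j.
    """
    qubits = [np.array([np.cos(xj), np.sin(xj)]) for xj in x]
    return reduce(np.kron, qubits)

x = np.array([0.1, 0.7, 1.3])          # hypothetical classical sample (n = 3)
y = np.array([0.2, 0.5, 1.0])          # a second, nearby sample
phi_x = angle_embed(x)                 # lives in an 8-dimensional Hilbert space
phi_y = angle_embed(y)

# For this map, the overlap factorizes as prod_j cos(x_j - y_j),
# so nearby samples yield states with overlap close to one.
overlap = float(phi_x @ phi_y)
print(phi_x.shape, round(overlap, 4))
```

Note how three classical features are lifted into an eight-dimensional state space, mirroring the requirement that the Hilbert space dimension exceed that of the classical data.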
II-B Quantum Communication of Semantic Representations
As per Figure 1, prior to transmission, our “bit” counterpart is an embedded quantum state. Here, every quantum state is inherently a semantic representation of the information to be transmitted. Hence, any $d$-dimensional vector in this space corresponds to a unique meaning (see Section II-A). Since the semantic representations encompass high-dimensional quantum states, adopting a quantum communication scheme would preserve the semantics transferred from the speaker to the listener. In essence, if such quantum states were to be down-converted to classical data, performing measurements and collapsing the quantum states before transmission would increase the risk of semantic errors. Also, explicitly quantifying the quantum-semantic error is not possible in a classical network due to the absence of metrics for this purpose. Consequently, to guarantee a successful transmission of the semantic-embedded quantum states, we propose the following methodology:

First, each classical data sample is embedded into a semantic-representing $d$-dimensional quantum state vector stored in a quantum random access memory (QRAM) using quantum feature maps, as explained previously.

Then, the speaker applies quantum clustering techniques to construct an efficient representation of the quantum semantics (see Section III-A). This results in $K$ $d$-dimensional quantum states capturing the semantic information of the input raw data.

Next, to represent a high-dimensional quantum state (qudit) using a photon of light, we leverage the concept of orbital angular momentum (OAM). OAM is a physical property of an electromagnetic wave and it corresponds to the phase of its angular momentum [chaccour2022seven]. The topological charges of OAM, i.e., the modes of OAM, are orthogonal and enable us to exploit orthogonal basis vectors that represent a qudit in a $d$-dimensional Hilbert space $\mathcal{H}$. In other words, an arbitrary $d$-dimensional qudit is generated via OAM modes. In fact, the quantum number $\ell$, which represents a topological charge in OAM, is unbounded, and can thus yield arbitrarily large Hilbert spaces. Nonetheless, due to some experimental considerations, some constraints may be imposed on its range [cozzolino2019high]. (It is important to note that there are various approaches to implement qudits other than encoding them with OAM, in addition to different techniques for OAM itself, such as the spontaneous parametric down-conversion process and wave-shaping devices. However, such hardware-related details are out of this short paper’s scope; see [cozzolino2019high] and references therein for more details.) Thus, pairs of entangled photons with opposite OAM quantum numbers are generated on the speaker’s side. The theoretical states produced for those photons are given by $|\Psi\rangle = \sum_{\ell} c_\ell\, |\ell\rangle_A |-\ell\rangle_B$, where $|\ell\rangle_A$ and $|-\ell\rangle_B$ represent the states of the two generated photons with OAM encoding, and the complex probability amplitudes are represented by $c_\ell$.

To initiate the quantum entanglement link, one of the generated entangled photons is transmitted to the listener over a quantum channel (fiber or free-space optical channel). The listener then detects the transmitted entangled photon and stores it in quantum memory. Subsequently, entanglement purification protocols [pan2001entanglement] can be applied if needed (see Section III-B).

Given that the entanglement link between the speaker and the listener is now established, the speaker maps each of the semantic-representing $d$-dimensional quantum state vectors to one of its respective entangled photons.

Finally, the quantum teleportation protocol is applied to transfer the semantics to the listener. The listener then performs quantum measurements and applies quantum gates to recover the embedded semantics and, through these quantum operations, the context of the raw data.
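The final teleportation step of the methodology above can be checked numerically. The sketch below simulates ideal, noiseless qudit teleportation with NumPy, using the computational basis as a stand-in for the OAM modes and a maximally entangled pair shared between speaker and listener. The function names, the choice of generalized Bell basis, and the correction convention are assumptions made for this illustration rather than details specified in the paper.

```python
import numpy as np

def shift_and_clock(d):
    """Generalized Pauli operators: X|a> = |a+1 mod d>, Z|a> = w^a |a>."""
    w = np.exp(2j * np.pi / d)
    X = np.roll(np.eye(d), 1, axis=0)
    Z = np.diag(w ** np.arange(d))
    return X, Z, w

def teleport(psi, n, m):
    """Return the listener's corrected state and the probability of Bell outcome (n, m)."""
    d = len(psi)
    X, Z, w = shift_and_clock(d)
    # Maximally entangled pair on qudits B (speaker) and C (listener).
    bell = np.eye(d).reshape(d * d) / np.sqrt(d)
    total = np.kron(psi, bell).reshape(d, d, d)      # indices (A, B, C)
    # Bell-basis vector |Phi_nm> = (1/sqrt d) sum_k w^{kn} |k>|k+m> on (A, B).
    phi_nm = np.zeros((d, d), dtype=complex)
    for k in range(d):
        phi_nm[k, (k + m) % d] = w ** (k * n) / np.sqrt(d)
    # Project A and B onto <Phi_nm|; what remains on C is X^m Z^{-n}|psi> / d.
    out = np.einsum("ab,abc->c", phi_nm.conj(), total)
    prob = np.vdot(out, out).real
    out = out / np.sqrt(prob)
    # Listener's correction Z^n X^{-m}, chosen from the classical outcome (n, m).
    correction = np.linalg.matrix_power(Z, n) @ np.linalg.matrix_power(X.conj().T, m)
    return correction @ out, prob

rng = np.random.default_rng(1)
d = 3                                                # hypothetical qudit dimension
psi = rng.normal(size=d) + 1j * rng.normal(size=d)
psi /= np.linalg.norm(psi)

fidelities = [abs(np.vdot(psi, teleport(psi, n, m)[0])) ** 2
              for n in range(d) for m in range(d)]
print(min(fidelities))
```

In this idealized setting every measurement outcome occurs with probability $1/d^2$ and the corrected state matches the input; channel noise, treated in Section III-B, is what lowers this fidelity in practice.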
Next, we discuss our proposed novel approaches to assess the performance of the QSC framework.
III Quantum-Embedded Semantics Analysis
We now introduce two novel approaches that empower our QSC approach with minimalism, efficiency, and accuracy. On the one hand, we leverage QML to guarantee minimalism and efficiency in the quantum semantic representations used. On the other hand, we develop a novel framework for the analysis of quantum semantic errors by leveraging concepts such as quantum fidelity and quantum entropy to guarantee that the three previously mentioned criteria are satisfied during the extraction and transfer processes of the quantum semantic representations.
III-A Quantum Clustering for Semantics’ Extraction
To equip our QSC framework with semantic-aware capabilities, we leverage QML techniques, which are well known for discovering hidden statistical patterns in data [biamonte2017quantum]. Such techniques, when integrated with high-dimensional Hilbert spaces, could lead to novel quantum semantic representations. This application of QML distinguishes our QSC framework from existing semantic-agnostic quantum frameworks [lloyd2020quantum, chehimi2021entanglement], which would mainly embed classical data into quantum states and send them towards the receiver. If we were to rely solely on such embedding here, then the system would create a number of quantum states that is equal to the number of classical data samples. This would no longer result in a minimalist and efficient semantic representation, and it would have poor abilities in extracting the contextual meaning of the classical data.
In contrast to existing semantic-agnostic quantum frameworks [lloyd2020quantum, chehimi2021entanglement], our proposed QSC framework performs the following steps for extracting the quantum semantics. Initially, each raw data sample is embedded into a $d$-dimensional quantum state, and then stored in customized QRAM [kerenidis2016quantum] structures. Next, an unsupervised quantum clustering process, e.g., quantum nearest neighbors algorithm [kerenidis2019q], is performed on the QRAM. As a result, the number of quantum states is significantly minimized. Without loss of generality, in this algorithm, we specify a number of clusters $K$, each of which has a quantum vector representing its center, or centroid, $|c_k\rangle$, $k = 1, \ldots, K$.
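The clustering step of this subsection can be sketched with a purely classical simulation. The code below groups state vectors by their overlap $|\langle a|b\rangle|^2$, which is the quantity a SWAP test would estimate on quantum hardware, and updates each centroid as the state with the largest total overlap with its members. The farthest-point initialization, the eigenvector-based centroid update, and all names and data are assumptions made for this illustration; they are not claimed to be the algorithm of [kerenidis2019q].

```python
import numpy as np

def fidelity(a, b):
    """Overlap |<a|b>|^2, the quantity a SWAP test estimates."""
    return abs(np.vdot(a, b)) ** 2

def cluster_states(states, K, iters=20):
    """Assign each state to the centroid of highest overlap, then update centroids."""
    # Farthest-point initialization (a simple deterministic heuristic).
    centroids = [states[0]]
    while len(centroids) < K:
        closeness = [max(fidelity(s, c) for c in centroids) for s in states]
        centroids.append(states[int(np.argmin(closeness))])
    for _ in range(iters):
        labels = np.array([int(np.argmax([fidelity(s, c) for c in centroids]))
                           for s in states])
        for k in range(K):
            members = [s for s, lab in zip(states, labels) if lab == k]
            if members:
                # Top eigenvector of sum |s><s| maximizes the total overlap.
                M = sum(np.outer(s, s.conj()) for s in members)
                centroids[k] = np.linalg.eigh(M)[1][:, -1]
    return labels, centroids

rng = np.random.default_rng(0)
d = 4                                   # hypothetical Hilbert-space dimension

def noisy_copy(base):
    """Perturb a basis state slightly and renormalize (synthetic data)."""
    v = base + 0.1 * (rng.normal(size=d) + 1j * rng.normal(size=d))
    return v / np.linalg.norm(v)

e0, e1 = np.eye(d)[0], np.eye(d)[1]
states = [noisy_copy(e0) for _ in range(10)] + [noisy_copy(e1) for _ in range(10)]
labels, centroids = cluster_states(states, K=2)
print(labels)   # 20 embedded states are summarized by only 2 centroid states
```

In this toy example, twenty embedded states collapse to two centroid states, which is the source of the resource savings analyzed in Section III-B.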
In general, after quantum embedding, the dataset has $N$ samples embedded into quantum vectors, $|\psi_i\rangle$, $i = 1, \ldots, N$, in the $d$-dimensional Hilbert space $\mathcal{H}$. These quantum vectors are initialized to different clusters either in an arbitrary fashion or by utilizing efficient heuristic approaches [vassilvitskii2006k]. Then, multiple iterations, as explained next, are performed to ensure that each quantum vector is assigned to the cluster with the nearest centroid vector. Here, a measure of distance (or similarity) between vectors in the Hilbert space is needed, such as the Euclidean distance, the SWAP test, or different quantum state discrimination approaches [wiebe2014quantum]. In each iteration, the distance between each vector and the different centroids is measured, and the vector is assigned to the cluster that has the closest centroid to that vector. As such, similar data vectors are clustered into the same group by being assigned to the same centroids, which helps discover hidden patterns that could not be explored in the raw classical data without QML. The quantum clustering technique enables us to identify data samples that share similarities in their contextual meaning, and it then assigns them to unique clusters with centroid vectors capturing those meanings or semantics. Ultimately, the centroids are the high-dimensional quantum vectors that capture the semantics of the data, and they are transferred to the listener.
III-B QSC Performance Analysis Framework
Next, we develop a comprehensive framework for analyzing the performance of QSC. Particularly, the analysis framework incorporates the necessary metrics to ensure minimalism, efficiency, and accuracy within the quantum semantics’ extraction and transmission processes. This novel framework is unique, compared to existing works [zhou2021semantic, xie2021task, tong2021federated, farshbafan2021common], since it guarantees a successful semantic assessment and quantifies quantum semantic errors, as explained next.

Minimalism is investigated by characterizing the number of quantum communication resources, i.e., the entangled photons with OAM encoding, consumed while conveying the quantum semantics to the listener. Given that our QSC framework adopts quantum clustering, the number of communication resources is minimized. In a semantic-agnostic quantum-embedding framework, the number of quantum communication resources is equal to the number of data samples, $N$. Meanwhile, the QSC framework reduces the communication resources to $K$, i.e., the number of clusters, where $K \ll N$.

Efficiency within the semantic extraction process is considered by characterizing the associated losses that the quantum computing devices undergo during the quantum embedding and clustering processes. In particular, today’s noisy intermediate-scale quantum (NISQ) devices incorporate noise and various unavoidable losses that affect the quantum circuits employed in the QSC framework. These losses render the process of generating a perfectly pure quantum state that embeds the classical data samples nearly impossible. Here, a widely adopted example that illustrates this phenomenon is the depolarizing noise. It can be represented as a quantum channel that describes the scenario in which the quantum state is completely lost with probability $1-\lambda$ [hubregtsen2021training].
In particular, $\lambda$ is a real-valued parameter that maps a quantum state $\rho$ into a linear combination that comprises itself and the identity matrix $I$. Thus, the noise model is represented by the map $\Delta_\lambda$, which is trace-preserving and completely positive, and is represented as [king2003capacity]:
$$\Delta_\lambda(\rho) = \lambda \rho + \frac{1-\lambda}{d} I. \tag{2}$$
It is important to note that the parameter $\lambda$ is bounded by $-\frac{1}{d^2-1} \leq \lambda \leq 1$ so as to guarantee the complete positivity condition.
To capture how efficiently a quantum state captures the semantic information in the presence of noise, we calculate the minimal output von Neumann entropy achieved after the quantum embedding and clustering processes for the depolarizing channel, as follows:
$$S_{\min}(\Delta_\lambda) = -\left(\lambda + \frac{1-\lambda}{d}\right)\log\left(\lambda + \frac{1-\lambda}{d}\right) - (d-1)\,\frac{1-\lambda}{d}\,\log\left(\frac{1-\lambda}{d}\right),$$
where a higher entropy implies more efficient capacity for capturing the contextual meaning of the data.

Accuracy of the recovered semantic quantum states at the listener with respect to the conveyed ones at the speaker is quantified by characterizing the quantum fidelity of the considered recovered semantic quantum states. Here, quantum fidelity captures the unavoidable losses that every quantum state undergoes when transmitted over quantum channels.
For instance, the evolution of a quantum state in an open quantum system is commonly described using Kraus operators [kraus1983states], which can correspond to multiple well-known noise sources encountered during the evolution of quantum states during the application of the quantum teleportation protocol. Again, here, we consider the depolarizing noise in the quantum channel between the speaker and the listener as a sample model to capture the losses encountered during state transmission. When this noise is considered, the initial quantum state of the photon transmitted from speaker to listener evolves, with some probability $p$, to a maximally mixed state [fonseca2019high].
In this regard, when the generated entangled pairs of qudits are considered to have maximally entangled measurements and channels, the average quantum fidelity that corresponds to the depolarizing noise will be [fonseca2019high]:
(3) where $d$ is the dimension of the quantum states (qudits).
If a low fidelity is encountered, the entanglement purification process may be applied [pan2001entanglement] in order to achieve a desired minimum quality of the entanglement connection. Meanwhile, when the entanglement connection between the speaker and the listener is characterized by a high fidelity, the semantic representations will be accurately recovered at the listener. Thus, the QSC framework will have a robust performance against quantum noise, losses, and the associated quantum-semantic errors.
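The efficiency and accuracy metrics above can be explored numerically under a simple model. The sketch below applies a depolarizing map of the form $\lambda\rho + \frac{1-\lambda}{d} I$ to a pure state, computes the von Neumann entropy of the output, and evaluates the fidelity between the input and the depolarized state. The function names, the use of base-2 logarithms, and the parameter values are assumptions for illustration; in particular, the fidelity computed here is that of a single depolarized pure state and is not claimed to reproduce the average teleportation fidelity of [fonseca2019high].

```python
import numpy as np

def depolarize(rho, lam):
    """Depolarizing map: lam * rho + (1 - lam) / d * I."""
    d = rho.shape[0]
    return lam * rho + (1 - lam) / d * np.eye(d)

def min_output_entropy(lam, d):
    """Closed-form output entropy (in bits) of a depolarized pure state."""
    big = lam + (1 - lam) / d           # eigenvalue along the input state
    small = (1 - lam) / d               # eigenvalue on the other d - 1 directions
    terms = [big] + [small] * (d - 1)
    return -sum(t * np.log2(t) for t in terms if t > 0)

def von_neumann_entropy(rho):
    """Entropy (in bits) computed directly from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def pure_state_fidelity(psi, rho):
    """Fidelity <psi|rho|psi> between a pure state and a density matrix."""
    return float(np.real(np.vdot(psi, rho @ psi)))

d, lam = 5, 0.9                         # hypothetical dimension and noise parameter
psi = np.eye(d)[0].astype(complex)
rho_out = depolarize(np.outer(psi, psi.conj()), lam)

entropy_numeric = von_neumann_entropy(rho_out)
entropy_closed = min_output_entropy(lam, d)
fid = pure_state_fidelity(psi, rho_out)
print(round(entropy_numeric, 4), round(entropy_closed, 4), round(fid, 4))
```

Under these assumptions the fidelity equals $1 - (1-\lambda)(d-1)/d$, so it decays mildly as $d$ grows for a fixed noise level, while the output entropy grows with $d$.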
Next, we simulate and assess the performance of our proposed QSC framework using the three unique identified criteria.
IV Simulation Results and Analysis
For our simulations, we consider a scenario in which the data traffic is modeled according to 3GPP TSG-RAN R1-070674 [3gpp2007lte]. This allows us to develop a complex data traffic model that is in conformity with the traffic observed in fifth-generation (5G) wireless services. Here, we particularly model gaming data traffic, where the packet size varies according to the largest extreme value probability distribution with a given mean and standard deviation (in bytes) [navarro2020survey].
First, in Figure 2, we showcase the advantage of our proposed QSC framework in terms of saving quantum communication resources (entangled photons) compared to a semantic-agnostic quantum communications framework. In particular, Figure 2 clearly shows that our QSC framework puts a significantly lower burden on communication resources compared to semantic-agnostic quantum frameworks (see Section III-A). From Figure 2, we observe that the semantic-agnostic quantum communications framework is independent of the choice of the dimension $d$, since every data sample is mapped to a $d$-dimensional quantum state, regardless of its dimension. In contrast, our QSC framework depends on the dimension $d$ of the semantic-embedding quantum states and exhibits a non-monotonic response versus $d$.
Figure 2 shows that, as $d$ increases, the benefits of quantum clustering in terms of communication resources are more pronounced. However, the impact of increasing $d$ is also a function of the data traffic. For instance, there is light traffic at the 100th communication round. In such cases, increasing the dimensionality has a minor effect on enhancing the QSC framework, yet our proposed approach still saves a share of the needed communication resources compared to the semantic-agnostic framework.
In contrast, when the data traffic is heavy, the proposed QSC framework substantially reduces the number of communication resources needed. In this case, we clearly see the impact of increasing $d$ in the QSC framework, since a larger $d$ further reduces the number of communication resources. Here, the quantum clustering process is capable of representing the semantics in a minimally sufficient fashion. In other words, the dimensions are acting like a perfect basis for a high-dimensional vector. Meanwhile, as $d$ increases significantly, we observe that the number of dimensions starts to negatively affect the communication resources. This is due to the fact that the number of dimensions becomes redundant to represent the data, and fewer dimensions are needed so as not to worsen the semantic representation configuration. Thus, optimizing the number of dimensions in our QSC framework in particular, and in a quantum-semantic setting in general, is an important problem that needs to be carefully scrutinized.
In Figure 3, we evaluate the impact of noise and losses in quantum devices and channels on the effectiveness of the QSC framework. Particularly, we validate the importance of increasing the dimension of the Hilbert space on extracting more efficient semantics, and we analyze its impact on the average achieved fidelity when conveying those semantics.
From Figure 3, we observe that the minimal achieved output entropy of the quantum depolarizing noise increases with $d$. This results from the fact that, when the dimensionality is larger, the noise becomes more contracted, and consequently less pervasive. Subsequently, increasing the dimensionality enables our QSC framework to perform a more efficient extraction of quantum-semantic representations and gain robustness against quantum-semantic errors. Moreover, Figure 3 shows that increasing the dimensionality leads to only a small decay in the quantum fidelity when there is a low noise level in the quantum channel. Additionally, in an extremely noisy channel, one can employ entanglement purification techniques [pan2001entanglement] to compensate for the losses in fidelity and ultimately guarantee minimal, efficient, and accurate quantum-semantic representations.
V Conclusion
In this paper, we have proposed a novel QSC framework for the extraction and transmission of quantum semantics and contextual meaning of raw classical data. The proposed framework develops a unique representation of quantum semantics by leveraging the principles of quantum embedding into high-dimensional Hilbert spaces and quantum clustering. Particularly, the QSC framework guarantees a minimalist utilization of quantum communication resources, an efficient extraction of the quantum semantics, and an accurate communication of the extracted quantum semantics. Our results validate the proposed QSC framework and demonstrate its advantages compared to semantic-agnostic quantum frameworks.