I Introduction
Holographic displays [1, 2] have emerged as attractive interface techniques for reconstructing three-dimensional (3D) scenes that provide full parallax and depth information to the human eye. 3D holographic displays can be used in many applications, such as entertainment, remote device operation, medical imaging, and simulated training, as shown in Fig. 1. A point cloud [3] is one of the data structures used to reconstruct 3D scenes/objects on a holographic display [4]. A point cloud is a set of 3D points, where each point is defined by 3D coordinates, i.e., (X, Y, Z), and color attributes, i.e., (R, G, B) or (Y, U, V).
In contrast to conventional two-dimensional (2D) images, the 3D points in point cloud data are not ordered and are non-uniformly distributed in space. One of the major issues in point cloud delivery is how to compress and send such a numerous and irregular structure of 3D points while maintaining high 3D reconstruction quality on displays. For example, when the number of 3D points is , the amount of traffic without any compression is approximately Mbits [5]. Such large traffic causes low reconstruction quality in point cloud delivery over links with limited data rates, especially wireless links.
For point cloud compression over wireless links, conventional encoders, such as the popular Point Cloud Library (PCL) [6, 7], use octree decomposition, prediction, quantization, and entropy coding. Specifically, a sender first decomposes the point cloud into multiple 3D point sets [8] and applies quantization and entropy coding to each point set to generate the compressed bitstream for transmission. Here, the compression rate of the bitstream is adaptively selected according to the wireless channel quality. After the compression, the transmission part uses a channel coding and digital modulation scheme to reliably transmit the compressed bitstream over wireless channels. High-quality transmission of point clouds over wireless links can realize immersive video applications such as virtual reality and augmented reality on wireless devices, as shown in Fig. 2.
However, the conventional schemes of point cloud delivery suffer from the following problems due to the unreliability of wireless channels. First, the encoded bitstream is highly vulnerable to bit errors [11]. When the channel signal-to-noise ratio (SNR) falls below a certain threshold, even a few bit errors in the bitstream during communication can cause a synchronization problem in point cloud decoding. As a result, the display does not reconstruct 3D scenes, and thus the reconstruction quality degrades significantly. This phenomenon is called the cliff effect [12]. Second, the reconstruction quality does not improve even when the wireless channel quality improves, unless adaptive rate control of source and channel coding is performed in real time according to the rapidly fading channel. This is called the leveling effect. Finally, quantization is a lossy process and its distortion cannot be recovered at the receiver.
As mentioned above, conventional point cloud transmissions face two challenging issues over wireless links: 1) the cliff effect and 2) the leveling effect. To overcome these issues, we propose a new point cloud transmission scheme, HoloCast, to reconstruct 3D scenes in high quality on holographic displays. The key idea of this scheme is to skip nonlinear operations, i.e., quantization and entropy coding, in point cloud coding. For high-quality delivery, this study considers 3D points as vertices in a graph, with edges between nearby vertices, to deal with the irregular structure, motivated by [13, 14]. Each point has attributes of 3D coordinates and color components, and those attributes are regarded as signals residing on the vertices of the graph. The proposed scheme applies the graph Fourier transform (GFT) [15, 16] to each attribute of the graph signals to compact the signal power, and the output is then scaled and directly mapped to transmission signals without relying on digital modulation schemes. The advantage of this modification lies in the fact that the point distortion due to communication noise is proportional to the magnitude of the noise, resulting in reconstruction quality that follows the wireless channel quality gracefully, without any cliff effect or leveling effect. We demonstrate that the proposed point cloud delivery scheme achieves graceful reconstruction quality with the improvement of wireless channel quality and better reconstruction performance compared to the conventional digital-based schemes. For example, HoloCast achieves dB and dB improvement in the attributes of the 3D coordinates and color components, respectively, in terms of mean squared error (MSE) compared with the digital-based delivery schemes.
Related Works and Our Contributions
Soft video delivery schemes have recently been proposed for multi-dimensional ordered video signals in [17, 18, 19, 20]. For example, SoftCast [17] was designed for 3D ordered video signals to realize graceful video delivery. It skips quantization and entropy coding, and uses a 3D discrete cosine transform (DCT) and analog modulation, which maps DCT coefficients directly to transmission signals, to ensure that the received video quality is proportional to the wireless channel quality. FoveaCast [18] incorporates the foveation characteristic of human vision into soft video delivery of 2D ordered video signals to achieve higher visual perceptual quality. FreeCast [20] extended soft video delivery to 5D ordered multi-view video plus depth (MVD) signals. It uses a 5D-DCT for decorrelation and directly sends the coefficients to realize graceful quality improvement with the improvement of wireless channel quality.
Our study realizes soft coding and decoding for point cloud delivery. Although existing schemes of soft video delivery deal with ordered and uniformly distributed video signals, point cloud delivery needs to handle non-ordered and non-uniformly distributed points in coding and decoding. To this end, the proposed scheme makes the following major contributions:

We regard point clouds as graph signals with the attributes of 3D coordinates and color components to deal with irregular structure of holographic data formats.

We introduce GFT and analog modulation for graph signals to exploit correlations in the graph domain for performance improvement.

We discuss the impact of graph Laplacian variants and adjacency hyperparameters on 3D scene reconstruction quality.

We demonstrate that GFT-based HoloCast achieves graceful 3D reconstruction quality with a significant performance improvement over digital-based point cloud delivery.
II HoloCast: Graceful Point Cloud Delivery
The objectives of our study are 1) to prevent cliff effect in 3D scene reconstruction and 2) to gracefully improve reconstruction quality with the improvement of wireless channel quality.
Fig. 3 shows an overview of the proposed HoloCast. The encoder first performs the GFT on the 3D points and the corresponding colors, i.e., the graph signals. The GFT coefficients are then scaled and analog-modulated according to the signal power information for wireless transmission. Next, the encoder sends the analog-modulated symbols to the receiver over a wireless channel, which is often impaired by additive white Gaussian noise (AWGN) and time-varying fading. At the receiver side, the decoder uses a minimum mean-square error (MMSE) filter to obtain the transmitted GFT coefficients. The decoder finally applies the inverse GFT to reconstruct the 3D coordinates and color components for display.
II-A Encoder
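We first represent the 3D points and color components using a weighted and undirected graph $\mathcal{G} = (\mathcal{V}, \mathcal{E}, \boldsymbol{W})$, where $\mathcal{V}$ and $\mathcal{E}$ are the vertex and edge sets of $\mathcal{G}$, respectively. $\boldsymbol{W}$ is an adjacency matrix with positive edge weights, and its $(i,j)$th entry $W_{i,j}$ represents the weight of the edge connecting vertices $i$ and $j$. For the graph $\mathcal{G}$, we consider the attributes of the point cloud, i.e., the 3D coordinates $\boldsymbol{p} \in \mathbb{R}^{N \times 3}$ and the color components $\boldsymbol{c} \in \mathbb{R}^{N \times 3}$, as signals that reside on the vertices of the graph ($N$ is the number of vertices). From the attributes, each weight can be calculated, e.g., by the Gaussian kernel as follows:
$$W_{i,j} = \exp\left(-\frac{\|\boldsymbol{p}_i - \boldsymbol{p}_j\|_2^2}{\epsilon}\right) \tag{1}$$
where $\boldsymbol{p}_i$ represents the 3D coordinates of point $i$ and $\epsilon$ is a hyperparameter. In HoloCast, we use either the sample variance or the standard deviation of the distances across all the points for the hyperparameter $\epsilon$. A sender then transforms the graph signals into a spectral representation using the GFT. The GFT is defined through the graph Laplacian operator $\boldsymbol{L}$ using the edge weight matrix $\boldsymbol{W}$ and the degree matrix $\boldsymbol{D}$, where $\boldsymbol{D}$ is the diagonal degree matrix whose $i$th diagonal element $D_i$ is equal to the sum of the weights of all the edges incident to vertex $i$. Specifically, the diagonal matrix is represented as:
$$\boldsymbol{D} = \mathrm{diag}(D_1, \ldots, D_N), \quad D_i = \sum_{j=1}^{N} W_{i,j} \tag{2}$$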
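The weight and degree computations described above can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: it assumes the Gaussian kernel takes the form exp(-||p_i - p_j||^2 / epsilon), that self-loops are excluded, and that the hyperparameter defaults to the sample variance of the pairwise distances. The function names `gaussian_adjacency` and `degree_matrix` are hypothetical.

```python
import numpy as np

def gaussian_adjacency(points, epsilon=None):
    """Dense Gaussian-kernel weight matrix for an (N, 3) array of 3D points."""
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)            # squared Euclidean distances
    if epsilon is None:
        # Assumption: sample variance of the pairwise distances (pairs i < j).
        upper = np.triu_indices(len(points), k=1)
        epsilon = np.var(np.sqrt(sq_dist[upper]), ddof=1)
    weights = np.exp(-sq_dist / epsilon)
    np.fill_diagonal(weights, 0.0)                  # assumption: no self-loops
    return weights

def degree_matrix(weights):
    """Diagonal matrix whose i-th entry is the sum of the weights at vertex i."""
    return np.diag(weights.sum(axis=1))

rng = np.random.default_rng(0)
points = rng.random((6, 3))                         # toy point cloud
W = gaussian_adjacency(points)
D = degree_matrix(W)
```

A dense N-by-N matrix is only practical for small clouds; a k-nearest-neighbour sparsification would be a natural alternative, but it is not something the text specifies.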
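Based on the degree matrix, we can calculate some variants of the graph Laplacian matrix [21]:
$$\boldsymbol{L} = \boldsymbol{D} - \boldsymbol{W} \tag{3}$$
$$\boldsymbol{L}_n = \boldsymbol{D}^{-1/2} \boldsymbol{L} \boldsymbol{D}^{-1/2} = \boldsymbol{I} - \boldsymbol{D}^{-1/2} \boldsymbol{W} \boldsymbol{D}^{-1/2} \tag{4}$$
$$\boldsymbol{L}_t = \boldsymbol{D}^{-1} \boldsymbol{W} \tag{5}$$
$$\boldsymbol{L}_r = \boldsymbol{D}^{-1} \boldsymbol{L} = \boldsymbol{I} - \boldsymbol{L}_t \tag{6}$$
where $\boldsymbol{I}$ denotes an identity matrix of proper dimension. We refer to these graph Laplacian matrices as the regular, normalized, transition, and random-walk Laplacian, respectively. We will discuss the impact of those graph Laplacian matrices on the delivery quality in Section III-C. In general, the graph Laplacian is a real symmetric matrix that has a complete set of orthonormal eigenvectors with corresponding non-negative eigenvalues. To obtain the eigenvectors and eigenvalues, the eigendecomposition of the Laplacian matrix is performed as: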
$$\boldsymbol{L} = \boldsymbol{\Phi} \boldsymbol{\Lambda} \boldsymbol{\Phi}^{\top} \tag{7}$$
where $\boldsymbol{\Phi}$ is the eigenvector matrix and $\boldsymbol{\Lambda}$ is a diagonal matrix containing the eigenvalues. (Footnote 1: For a non-diagonalizable graph Laplacian matrix, the singular value decomposition (SVD) is instead used to express $\boldsymbol{L}$ as $\boldsymbol{L} = \boldsymbol{U} \boldsymbol{\Sigma} \boldsymbol{V}^{\top}$, where $\boldsymbol{U}$, $\boldsymbol{\Sigma}$, and $\boldsymbol{V}$ denote the left singular vector matrix, the diagonal matrix containing the singular values, and the right singular vector matrix, respectively. In this case, we use the right singular vectors of $\boldsymbol{L}$ as the graph-based transform basis matrix $\boldsymbol{\Phi}$.) The multiplicity of the smallest eigenvalue indicates the number of connected components of the graph. The GFT coefficients of each attribute of the graph signals are obtained by multiplying the graph-based transform basis matrix by the corresponding attribute vector as follows:
$$\boldsymbol{s} = \boldsymbol{\Phi}^{\top} \boldsymbol{f} \tag{8}$$
where $\boldsymbol{s}$ is a vector of GFT coefficients corresponding to the graph signal $\boldsymbol{f}$. After power allocation for each GFT coefficient, the GFT coefficients are mapped to I (in-phase) and Q (quadrature-phase) components for analog wireless transmission.
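The four Laplacian variants and the forward and inverse GFT can be sketched as below. This is an illustrative sketch under stated assumptions, not the paper's code: the helper names `graph_laplacian` and `gft_basis` are hypothetical, all vertex degrees are assumed to be nonzero, and, as a simplification of the footnote, the right singular vectors are used for any non-symmetric Laplacian so that the basis is always orthonormal.

```python
import numpy as np

def graph_laplacian(W, kind="regular"):
    """Return one of four Laplacian variants for a symmetric weight matrix W."""
    d = W.sum(axis=1)                       # vertex degrees (assumed nonzero)
    I = np.eye(len(d))
    if kind == "regular":
        return np.diag(d) - W
    if kind == "normalized":
        d_inv_sqrt = np.diag(1.0 / np.sqrt(d))
        return I - d_inv_sqrt @ W @ d_inv_sqrt
    if kind == "transition":
        return np.diag(1.0 / d) @ W
    if kind == "random-walk":
        return I - np.diag(1.0 / d) @ W
    raise ValueError(f"unknown Laplacian kind: {kind}")

def gft_basis(L):
    """Orthonormal basis: eigenvectors if L is symmetric, else right singular vectors."""
    if np.allclose(L, L.T):
        _, basis = np.linalg.eigh(L)        # eigh returns orthonormal eigenvectors
        return basis
    _, _, vt = np.linalg.svd(L)
    return vt.T

W = np.array([[0.0, 1.0, 0.5, 0.0],
              [1.0, 0.0, 0.2, 0.3],
              [0.5, 0.2, 0.0, 1.0],
              [0.0, 0.3, 1.0, 0.0]])
f = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [2.0, 1.0, 0.5]])             # one attribute row (e.g. XYZ) per vertex

Phi = gft_basis(graph_laplacian(W, "random-walk"))
s = Phi.T @ f                               # forward GFT, one column per channel
f_rec = Phi @ s                             # inverse GFT recovers the signal
```

Because the basis is orthonormal in both branches, the inverse transform is simply a multiplication by the basis matrix, and the receiver needs the same basis (or the information to rebuild it) as metadata.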
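Let $x_i$ denote the $i$th analog-modulated symbol, which is the $i$th GFT coefficient $s_i$ of all the attributes scaled by a factor of $g_i$ for noise reduction as follows:
$$x_i = g_i \cdot s_i \tag{9}$$
The optimal scale factor $g_i$ is obtained by minimizing the MSE under the power constraint with a total power budget of $P$ as follows:
$$\min_{\{g_i\}} \ \mathrm{MSE} = \sum_{i=1}^{N} \mathbb{E}\left[(s_i - \hat{s}_i)^2\right] = \sum_{i=1}^{N} \frac{\sigma^2 \lambda_i}{g_i^2 \lambda_i + \sigma^2} \tag{10}$$
$$\text{s.t.} \quad \sum_{i=1}^{N} g_i^2 \lambda_i = P \tag{11}$$
where $\mathbb{E}[\cdot]$ denotes expectation, $\hat{s}_i$ is a receiver estimate of the transmitted GFT coefficient, $\lambda_i$ is the power of the $i$th GFT coefficient, $N$ is the number of GFT coefficients, and $\sigma^2$ is a receiver noise variance. As shown in [17], the near-optimal solution is expressed as
$$g_i = \lambda_i^{-1/4} \sqrt{\frac{P}{\sum_{j=1}^{N} \lambda_j^{1/2}}} \tag{12}$$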
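The power allocation can be illustrated with a few lines of Python. The sketch assumes the SoftCast-style near-optimal rule from [17], in which the scale factor is proportional to the coefficient power raised to -1/4, normalized here so that the transmitted power summed over all coefficients equals the budget. The function name `scale_factors` and the example powers are hypothetical, and all powers are assumed to be strictly positive.

```python
import math

def scale_factors(powers, total_power):
    """Per-coefficient gains g_i = lam_i^(-1/4) * sqrt(P / sum_j sqrt(lam_j))."""
    norm = sum(math.sqrt(lam) for lam in powers)
    return [lam ** -0.25 * math.sqrt(total_power / norm) for lam in powers]

powers = [16.0, 4.0, 1.0, 0.25]             # example coefficient powers
g = scale_factors(powers, total_power=1.0)

# Transmitted power after scaling, which should match the budget.
used_power = sum(gi * gi * lam for gi, lam in zip(g, powers))
```

Low-power coefficients receive larger gains, which protects them against channel noise at the expense of the high-power coefficients.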
II-B Decoder
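Over the wireless links, the receiver obtains the received symbol, which is modeled as follows:
$$y_i = x_i + n_i \tag{13}$$
where $y_i$ is the $i$th received symbol, $x_i$ is the corresponding transmitted symbol, and $n_i$ is an effective AWGN with a variance of $\sigma^2$ (which is already normalized by the wireless channel strength in the presence of fading attenuation). The GFT coefficients are extracted from the I and Q components via an MMSE filter [17]:
$$\hat{s}_i = \frac{g_i \lambda_i}{g_i^2 \lambda_i + \sigma^2} \cdot y_i \tag{14}$$
where $g_i$ is the scale factor and $\lambda_i$ is the power of the $i$th GFT coefficient. The decoder then reconstructs the corresponding graph signals $\hat{\boldsymbol{f}}$, i.e., the attributes of 3D coordinates and color components, by applying the inverse GFT, i.e., multiplying the filtered GFT coefficients $\hat{\boldsymbol{s}}$ of each attribute by the graph-based transform basis matrix $\boldsymbol{\Phi}$, as follows:
$$\hat{\boldsymbol{f}} = \boldsymbol{\Phi} \hat{\boldsymbol{s}} \tag{15}$$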
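As a rough illustration of the receiver side, the following sketch passes one scaled coefficient through an AWGN channel and applies the linear MMSE filter known from SoftCast [17]. The variable names and the numeric values are assumptions chosen for the example, and fading is ignored.

```python
import math
import random

def mmse_filter(y, g, lam, noise_var):
    """Linear MMSE estimate of a coefficient with power lam that was sent as x = g * s."""
    return g * lam / (g * g * lam + noise_var) * y

random.seed(1)
lam, g, noise_var = 4.0, 0.5, 0.01           # coefficient power, gain, noise variance
s = random.gauss(0.0, math.sqrt(lam))        # transmitted GFT coefficient
y = g * s + random.gauss(0.0, math.sqrt(noise_var))   # AWGN channel output
s_hat = mmse_filter(y, g, lam, noise_var)    # receiver estimate
```

With no noise the filter reduces to a division by the gain, and as the noise grows the estimate shrinks smoothly toward zero, which is the mechanism behind the graceful quality degradation described in the text.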
II-C Analog Compression for Limited Bandwidth
The previous designs assume that the sender has enough bandwidth to transmit all the coefficients in the spectral domain over the wireless medium. If the available bandwidth and/or time resources for wireless channel use are restricted, the sender has to selectively transmit the coefficients to fit the available bandwidth. For such cases, our scheme sorts the coefficients in descending order of power and picks the higher-power coefficients to fill the bandwidth. When the sender discards a coefficient, the receiver regards the discarded coefficient as zero. As a result, a form of data compression can be accomplished even for analog-based delivery. Even when some coefficients are discarded to reduce the amount of data, the receiver can still achieve graceful reconstruction quality until reaching the distortion limit imposed by the compression.
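The selection rule above can be sketched as follows. The helper names `select_coefficients` and `restore` are hypothetical, the per-coefficient power is approximated by the squared magnitude for the example, and the sketch assumes the receiver learns which indices were kept (for instance through metadata), which the text does not detail.

```python
def select_coefficients(coeffs, powers, budget):
    """Keep the `budget` highest-power coefficients; return their indices and values."""
    order = sorted(range(len(coeffs)), key=lambda i: powers[i], reverse=True)
    kept = sorted(order[:budget])
    return kept, [coeffs[i] for i in kept]

def restore(kept, values, n):
    """Receiver side: treat every discarded coefficient as zero."""
    out = [0.0] * n
    for i, v in zip(kept, values):
        out[i] = v
    return out

coeffs = [0.1, -3.0, 0.5, 2.0, -0.05]
powers = [c * c for c in coeffs]             # assumption: power taken as squared magnitude
kept, values = select_coefficients(coeffs, powers, budget=2)
restored = restore(kept, values, len(coeffs))
```

Since the GFT compacts most of the signal power into few coefficients, dropping the weakest ones bounds the distortion by the sum of the discarded powers.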
III Performance Evaluation
III-A Simulation Settings
Performance Metric: We evaluate the reconstruction quality of point cloud delivery in terms of the symmetric MSE based on [22] in each attribute of 3D coordinates and color components . The symmetric MSE of the 3D coordinates, , can be obtained as follows:
(16) 
where is the original 3D coordinates and is the decoded 3D coordinates. Here, each direction of the asymmetric MSE in the 3D coordinates is defined as follows:
(17)  
(18) 
The symmetric MSE of the color components, , is derived analogously as follows:
(19) 
where and are the original and decoded color components, respectively. In this case, the asymmetric MSE of the color component is defined as follows:
(20) 
(21) 
where represents the color components of the corresponding 3D coordinates .
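A metric of this kind can be sketched as below. The exact definitions in (16) to (21) are not reproduced here, so the sketch relies on a common convention that should be read as an assumption: each asymmetric MSE matches every source point to its nearest neighbour in the other cloud, the color error is measured between matched points, and the symmetric MSE is the larger of the two directions. All function names are hypothetical.

```python
import numpy as np

def _nearest(src_xyz, dst_xyz):
    """Index of, and squared distance to, the nearest dst point for each src point."""
    d2 = ((src_xyz[:, None, :] - dst_xyz[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1), d2.min(axis=1)

def asymmetric_mse(src_xyz, dst_xyz, src_rgb=None, dst_rgb=None):
    """One-directional MSE: geometry if no colors are given, otherwise color."""
    src_xyz = np.asarray(src_xyz, dtype=float)
    dst_xyz = np.asarray(dst_xyz, dtype=float)
    idx, d2 = _nearest(src_xyz, dst_xyz)
    if src_rgb is None:
        return float(d2.mean())
    src_rgb = np.asarray(src_rgb, dtype=float)
    dst_rgb = np.asarray(dst_rgb, dtype=float)
    return float(((src_rgb - dst_rgb[idx]) ** 2).sum(axis=-1).mean())

def symmetric_mse(a_xyz, b_xyz, a_rgb=None, b_rgb=None):
    """Assumption: symmetric MSE is the maximum of the two asymmetric MSEs."""
    return max(asymmetric_mse(a_xyz, b_xyz, a_rgb, b_rgb),
               asymmetric_mse(b_xyz, a_xyz, b_rgb, a_rgb))

original = [[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]]
decoded = [[0.0, 0.0, 0.0]]
geometry_error = symmetric_mse(original, decoded)
```

The brute-force distance matrix is quadratic in the number of points; a spatial index such as a k-d tree would be the usual choice for large clouds.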
Point Cloud Dataset: We use the reference point clouds, namely, pencil__10_0, pencil__9_0, and office1, whose number of points is , , and , respectively. We first focus on pencil__10_0 to compare between HoloCast and digitalbased delivery. The performance at a different number of 3D points will then be evaluated with pencil__9_0. In addition, we demonstrate the visual quality for the point cloud data of office1 in Section IIIE.
Wireless Settings: The received symbols are impaired by an AWGN channel. For digital-based schemes, we use a rate convolutional code with a constraint length of . The digital modulation formats are either quadrature phase-shift keying (QPSK), 16-ary quadrature amplitude modulation (16QAM), or 256-ary QAM (256QAM).
Digital Point Cloud Coder: We compare HoloCast with the conventional digital-based delivery, which is based on the point cloud digital compression used in PCL [6]. We consider two default profiles for compression: LOW_RES_OFFLINE_COMPRESSION_WITH_COLOR and MED_RES_OFFLINE_COMPRESSION_WITH_COLOR. Note that this digital-based PCL point cloud delivery does not exploit GFT, although there is recent work on GFT-based digital compression schemes that improve the efficiency, e.g., [13, 14]. Because the primary objective of our paper is a preliminary demonstration of a new soft delivery technique for point cloud data, we focus on the widely used PCL-based digital delivery for benchmark performance comparisons in this paper. Nevertheless, we plan to compare HoloCast with GFT-based digital compression methods in the near future.
III-B HoloCast vs. Digital-based Schemes
We first evaluate the quality of HoloCast and conventional digital-based schemes. Fig. 4 (a) shows the MSE of the 3D coordinate attributes in the digital-based scheme and HoloCast as a function of the wireless channel SNR. Here, HoloCast uses the sample variance of point distances as the hyperparameter and the random-walk matrix for the graph Laplacian. In addition, we consider two additional HoloCast schemes to demonstrate the impact of GFT on quality improvement: DCT-based decorrelation and no decorrelation. From the evaluation results in Fig. 4 (a), we make the following observations:

HoloCast gracefully improves the reconstruction quality of 3D coordinate attributes with the improvement of wireless channel quality.

Digital-based schemes suffer from the cliff effect in low channel SNR regimes, because bit errors cause synthesis errors in entropy decoding, and from the leveling effect in high channel SNR regimes due to quantization errors.

GFT-based HoloCast achieves a better MSE than DCT-based HoloCast and HoloCast without decorrelation. GFT can utilize correlations among non-ordered and non-uniformly distributed 3D points by treating the 3D point data as graph signals.
For example, HoloCast achieves dB and dB improvement compared with HoloCast without decorrelation and DCT-based HoloCast on average across the channel SNRs between 0 dB and 30 dB, respectively.
Fig. 4 (b) shows the MSE of the color component attributes in the digital-based scheme and HoloCast as a function of the wireless channel SNR. For the color component attributes as well, HoloCast realizes graceful quality improvement with the improvement of wireless channel quality. Digital-based schemes have low reconstruction quality even in high channel SNR regimes, which suggests that GFT-based decorrelation has a great advantage in representing point clouds with higher reconstruction quality.
For further quality improvement in color components, we can consider the bilateral Gaussian kernel [23] in each weight to decorrelate color components more efficiently. Specifically, Eq. (1) will be modified as follows:
(22) 
where and are hyperparameters for 3D coordinates and color components, respectively. Our evaluation in the quality of color components verified that the use of bilateral kernel in (22) instead of (1) can offer additional dB gain on average across the channel SNRs between 0 dB and 30 dB.
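A bilateral Gaussian kernel of this kind can be sketched as the product of a geometry term and a color term. The exact form of (22) is not reproduced here, so the product form, the parameter names `eps_p` and `eps_c`, the toy data, and the exclusion of self-loops are all assumptions made for illustration.

```python
import numpy as np

def pairwise_sq_dist(x):
    """Squared Euclidean distance between every pair of rows of x."""
    x = np.asarray(x, dtype=float)
    diff = x[:, None, :] - x[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def bilateral_adjacency(points, colors, eps_p, eps_c):
    """Edge weights that decay with both geometric and color distance."""
    weights = (np.exp(-pairwise_sq_dist(points) / eps_p)
               * np.exp(-pairwise_sq_dist(colors) / eps_c))
    np.fill_diagonal(weights, 0.0)           # assumption: no self-loops
    return weights

points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
colors = np.array([[255.0, 0.0, 0.0], [250.0, 5.0, 0.0], [0.0, 0.0, 255.0]])
W_bi = bilateral_adjacency(points, colors, eps_p=1.0, eps_c=1000.0)
```

In this toy example, vertices 1 and 2 are equally far from vertex 0, but the edge to the similarly colored vertex 1 receives a much larger weight, so the resulting transform treats color discontinuities as weak links.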
III-C Impacts of Graph Laplacian Matrix and Adjacency Hyperparameters
In the previous section, we evaluated the performance of HoloCast using the random-walk graph Laplacian matrix and the variance-based hyperparameter. In HoloCast, different types of graph Laplacian matrix can be used to encode/decode graph signals. In addition, the weight matrix under consideration in Eq. (1) highly depends on the value of the Gaussian kernel hyperparameter. For the calculation of the hyperparameter, the sample variance (var) or the standard deviation (std) of point distances is often used. In this section, we discuss the effects of the graph Laplacian matrix and the hyperparameter on the reconstruction quality in detail.
Figs. 5 (a) and (b) show the MSE of the 3D coordinate and color component attributes in HoloCast, respectively, with different graph Laplacian matrices and hyperparameters as a function of wireless channel quality. The key results from these figures are summarized as follows:

The random-walk Laplacian matrix with the hyperparameter of standard deviation achieves the best performance in 3D coordinate attributes while the regular Laplacian matrix with the hyperparameter of variance yields the best quality in the color component attributes.

When we use the normalized and transition matrices as the graph Laplacian operator, the sender should use the standard deviation as the hyperparameter.

Interestingly, HoloCast with the random-walk Laplacian matrix achieves better 3D coordinate reconstruction using the hyperparameter of the variance, while achieving high-quality color component reconstruction using the hyperparameter of the standard deviation.
How to optimize the weight matrix and the Laplacian matrix is still an open problem. We leave a rigorous analysis as future work.
III-D Impacts of Different Point Clouds
The previous sections use a relatively small number of 3D points to demonstrate the benefit of HoloCast. In this section, we consider a larger number of 3D points as the test point cloud to show the scalability of the proposed HoloCast. Figs. 6 (a) and (b) show the MSE of the 3D coordinate and color component attributes in HoloCast, respectively, for the point cloud data of pencil__9_0 (). Compared with the small number of 3D points in Figs. 4 (a) and (b), HoloCast achieves better reconstruction quality in both 3D coordinates and color components. For example, HoloCast achieves dB and dB improvement on average across the channel SNRs between 0 dB and 30 dB, respectively.
III-E Visual Quality
Finally, Fig. 7 compares the visual quality of HoloCast and digital-based schemes for the reference point cloud of office1. We consider the digital-based scheme with the QPSK modulation format at a channel SNR of dB, where the compressed bitstream can be successfully transmitted to the receiver over wireless channels. Here, HoloCast uses DCT for the decorrelation of the point cloud. The MSE of color attributes achieved by the digital-based scheme is dB, whereas dB and dB are achieved by HoloCast at wireless channel SNRs of dB and dB, respectively. From the snapshots, we can observe that the digital-based scheme provides a lower-quality point cloud (color degradation at the door). In contrast, HoloCast gracefully improves the reconstruction quality according to the available wireless channel quality. Specifically, HoloCast can reproduce a clean 3D scene with details at a higher channel SNR of dB.
IV Conclusion
In this paper, we proposed HoloCast to realize graceful point cloud delivery over wireless links/networks. In contrast to conventional 2D images, 3D point cloud data are not ordered and are non-uniformly distributed in space. HoloCast regards the 3D points and color components as graph signals and directly transmits linear-transformed signals based on GFT. Evaluation results with several point cloud datasets showed that HoloCast yields better reconstruction quality even in low wireless channel SNR regimes. A feasibility study through practical experiments on various datasets with a reduced amount of metadata will be conducted as future work.
Acknowledgment
T. Fujihashi’s work was partly supported by JSPS KAKENHI Grant Number 17K12672.
References
 [1] P. A. Blanche, A. Bablumian, R. Voorakaranam, C. Christenson, W. Lin, T. Gu, D. Flores, P. Wang, W. Y. Hsieh, M. Kathaperumal, B. Rachwal, O. Siddiqui, J. Thomas, R. A. Norwood, M. Yamamoto, and N. Peyghambarian, “Holographic three-dimensional telepresence using large-area photorefractive polymer,” Nature, vol. 468, no. 7320, pp. 80–83, 2010.
 [2] H. Yu, K. Lee, J. Park, and Y. Park, “Ultrahigh-definition dynamic 3D holographic display by active control of volume speckle fields,” Nature Photonics, vol. 11, no. 3, pp. 186–192, 2017.
 [3] R. Mekuria and L. Bivolarsky, “Overview of the MPEG activity on point cloud compression,” in Data Compression Conference, 2016, p. 620.
 [4] P. Su, W. Cao, J. Ma, B. Cheng, X. Liang, L. Cao, and G. Jin, “Fast computer-generated hologram generation method for three-dimensional point cloud model,” Journal of Display Technology, vol. 12, no. 12, pp. 1688–1694, 2016.
 [5] M. Preda, “Point cloud compression in MPEG,” 2017.
 [6] J. Kammerl, N. Blodow, R. B. Rusu, S. Gedikli, M. Beetz, and E. Steinbach, “Real-time compression of point cloud streams,” in IEEE International Conference on Robotics and Automation, 2012, pp. 778–785.
 [7] K. Muller, H. Schwarz, D. Marpe, C. Bartnik, S. Bosse, H. Brust, T. Hinz, H. Lakshman, P. Merkle, F. H. Rhee, G. Tech, M. Winken, and T. Wiegand, “3D is here: Point cloud library (PCL),” in IEEE International Conference on Robotics and Automation, 2011, pp. 1–4.
 [8] R. Schnabel and R. Klein, “Octree-based point-cloud compression,” in Eurographics Symposium on Point-Based Graphics, 2006, pp. 111–121.
 [9] B. Schwarz, “Lidar: Mapping the world in 3D,” Nature Photonics, vol. 4, no. 7, pp. 429–430, 2010.
 [10] K. Rematas, I. Kemelmacher-Shlizerman, B. Curless, and S. Seitz, “Soccer on your tabletop,” in IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 1–10.
 [11] S. Pudlewski, N. Cen, Z. Guan, and T. Melodia, “Video transmission over lossy wireless networks: A cross-layer perspective,” IEEE Journal of Selected Topics in Signal Processing, vol. 9, no. 1, pp. 6–21, 2015.
 [12] T. Fujihashi, T. Koike-Akino, T. Watanabe, and P. V. Orlik, “High-quality soft video delivery with GMRF-based overhead reduction,” IEEE Transactions on Multimedia, vol. 20, no. 2, pp. 473–483, Feb. 2018.
 [13] D. Thanou, P. A. Chou, and P. Frossard, “Graph-based compression of dynamic 3D point cloud sequences,” IEEE Transactions on Image Processing, vol. 25, no. 4, pp. 1765–1778, 2016.
 [14] P. de Oliveira Rente, C. Brites, J. Ascenso, and F. Pereira, “Graph-based static 3D point clouds geometry coding,” IEEE Transactions on Multimedia, vol. PP, no. 99, pp. 1–16, 2018.
 [15] A. Ortega, P. Frossard, J. Kovacevic, J. M. F. Moura, and P. Vandergheynst, “Graph signal processing: Overview, challenges, and applications,” Proceedings of the IEEE, vol. 106, no. 5, pp. 808–828, 2018.
 [16] G. Cheung, E. Magli, Y. Tanaka, and M. K. Ng, “Graph spectral image processing,” Proceedings of the IEEE, vol. 106, no. 5, pp. 907–930, 2018.
 [17] S. Jakubczak and D. Katabi, “A cross-layer design for scalable mobile video,” in ACM Annual International Conference on Mobile Computing and Networking, Las Vegas, NV, Sep. 2011, pp. 289–300.
 [18] J. Shen, L. Yu, L. Li, and H. Li, “Foveation-based wireless soft image delivery,” IEEE Transactions on Multimedia, vol. 20, no. 10, pp. 2788–2800, 2018.
 [19] D. He, C. Luo, F. Wu, and W. Zeng, “Swift: A hybrid digital-analog scheme for low-delay transmission of mobile stereo video,” in ACM International Conference on Modeling, Analysis, and Simulation of Wireless and Mobile Systems, Cancun, Mexico, Nov. 2015, pp. 327–336.
 [20] T. Fujihashi, T. Koike-Akino, T. Watanabe, and P. V. Orlik, “FreeCast: Graceful free-viewpoint video delivery,” IEEE Transactions on Multimedia, vol. PP, no. 99, pp. 1–11, 2019.
 [21] R. Horaud, “A short tutorial on graph Laplacians, Laplacian embedding, and spectral clustering.” [Online]. Available: http://csustan.csustan.edu/~tom/LectureNotes/Clustering/GraphLaplaciantutorial.pdf
 [22] P. A. Chou, E. Pavez, R. L. de Queiroz, and A. Ortega, “Dynamic polygon clouds: Representation and compression for VR/AR,” Microsoft Research Technical Report, Tech. Rep., 2017.
 [23] X. Liu, G. Cheung, X. Wu, and D. Zhao, “Random walk graph Laplacian based smoothness prior for soft decoding of JPEG images,” IEEE Transactions on Image Processing, vol. 26, no. 2, pp. 509–524, 2017.