1 Introduction
Image compression is an application of data compression for digital images that lowers their storage and/or transmission requirements. Transform coding [8] has been successful in yielding practical and efficient image compression algorithms such as JPEG [27] and JPEG2000 [18]. The transformation converts an input into a latent representation in the transform domain where lossy compression, typically a combination of quantization and lossless source coding, is more amenable and more efficient. For example, JPEG utilizes the discrete cosine transform (DCT) to convert an image into a sparse frequency-domain representation. JPEG2000 replaces the DCT with an enhanced discrete wavelet transform.
Deep learning is now leading many performance breakthroughs in various computer vision tasks [13]. Along with this revolutionary progress, learned image compression has also attracted significant interest [3, 23, 24, 19, 1, 15, 4, 9, 16, 14]. In particular, nonlinear transform coding designed with deep neural networks has advanced to outperform the classical image compression codecs carefully designed and optimized by domain experts, e.g., BPG [5], which is a still-image version of the high efficiency video coding (HEVC) standard [22]. We note that, very recently, only a few learning-based image compression schemes have reached the performance of the state-of-the-art BPG codec in peak signal-to-noise ratio (PSNR), a metric based on mean squared error (MSE) [16, 14].

[Figure 2: Ground truth and compressed images from the Kodak dataset.]

                      Ours     BPG (4:4:4)  JPEG2000  JPEG
Bits per pixel (BPP)  0.1697   0.1697       0.1702    0.1775
PSNR (dB)             32.2332  31.9404      30.3140   27.3389
MS-SSIM               0.9602   0.9539       0.9369    0.8669
The resemblance of nonlinear transform coding to autoencoders has been established and exploited for image compression in [3, 23]: an encoder transforms an image (a set of pixels) into a latent representation in a lower-dimensional space, and a decoder performs an approximate inverse transform that converts the latent representation back to the image. The transformation is desired to yield a latent representation with the smallest entropy, given a distortion level, since the entropy is the minimum rate achievable with lossless entropy source coding [7, Section 5.3]. In practice, however, it is generally not straightforward to calculate and optimize the exact entropy of a latent representation. Hence, the rate-distortion (RD) tradeoff is optimized by minimizing an entropy estimate of the latent representation provided by an autoencoder at a target quality. To improve compression efficiency, recent methods have focused on developing accurate entropy estimation models [1, 15, 4, 16, 14] with sophisticated density estimation techniques such as variational Bayes and autoregressive context modeling.

Given a model that provides an accurate entropy estimate of a latent representation, the previous autoencoder-based image compression frameworks optimize their networks by minimizing the weighted sum of the RD pairs using the method of Lagrange multipliers. The Lagrange multiplier introduced in the Lagrangian (see (2)) is treated as a hyperparameter used to train a network for a desired tradeoff between the rate and the quality of compressed images. This implies that one needs to train and deploy separate networks for rate adaptation; one would have to retrain a network while varying the Lagrange multiplier. This is impractical when we operate at a broad range of the RD curve with fine resolution and the size of each network is large.
In this paper, we suggest training and deploying only one variable-rate image compression network that is capable of rate adaptation. In particular, we propose a conditional autoencoder, conditioned on the Lagrange multiplier: the network takes the Lagrange multiplier as an input and produces a latent representation whose rate depends on the input value. Moreover, we propose training the network with mixed quantization bin sizes, which allows us to adapt the rate by adjusting the bin size applied to the quantization of the latent representation. Coarse rate adaptation to a target is achieved by varying the Lagrange multiplier in the conditional model, while fine rate adaptation is done by tuning the quantization bin size. We illustrate our variable-rate image compression model in Figure 1.
Conditional autoencoders have been used for conditional generation [21, 26], where the conditioning variables are typically labels, attributes, or partial observations of the target output. However, our conditional autoencoder takes a hyperparameter of the optimization problem, i.e., the Lagrange multiplier, as its conditioning variable. We basically solve multiple objectives using one conditional network, instead of solving them individually using separate non-conditional networks (each optimized for one objective), which is new to the best of our knowledge.

We also note that variable-rate models using recurrent neural networks (RNNs) were proposed in [24, 9]. However, the RNN-based models require progressive encoding and decoding, depending on the target image quality. The increasing number of iterations required to obtain a higher-quality image is not desirable in certain applications and platforms. Our variable-rate model is different from the RNN-based models: it is based on a conditional autoencoder that needs no multiple iterations, while the quality is controlled by its conditioning variables, i.e., the Lagrange multiplier and the quantization bin size. Our method also shows superior performance over the RNN-based models in [24, 9].

We evaluate the performance of our variable-rate image compression model on the Kodak image dataset [12] for both the objective image quality metric, PSNR, and a perceptual score measured by the multi-scale structural similarity (MS-SSIM) [28]. The experimental results show that our variable-rate model outperforms BPG in both PSNR and MS-SSIM; an example from the Kodak dataset is shown in Figure 2. Moreover, our model shows a comparable and sometimes better RD tradeoff than the state-of-the-art learned image compression models [16, 14] that outperform BPG by deploying multiple networks trained for different target rates.
2 Preliminary
We consider a typical autoencoder architecture consisting of encoder $f_\phi$ and decoder $g_\theta$, where an input image $x$ is encoded into a latent representation $y = f_\phi(x)$, which is quantized with bin size $\Delta$; we let $\hat{y} = \Delta \lfloor y / \Delta \rceil$, where $\lfloor \cdot \rceil$ denotes elementwise rounding to the nearest integer. For now, we fix $\Delta = 1$. Lossless entropy source coding, e.g., arithmetic coding [7, Section 13.3], is used to generate a compressed bitstream from the quantized representation $\hat{y}$. Let $x \sim p(x)$, where $p(x)$ is the probability density function of natural images.

Deterministic quantization. Suppose that we take entropy source coding for the quantized latent variable $\hat{y}$ and achieve its entropy rate. The rate and the squared L2 distortion (i.e., the MSE loss) are given by
$R = \mathbb{E}_{x \sim p(x)}[-\log_2 P_{\hat{y}}(\hat{y})], \quad D = \mathbb{E}_{x \sim p(x)}[\|x - g_\theta(\hat{y})\|_2^2]$  (1)
where $p(x)$ is the probability density function of all natural images, and $P_{\hat{y}}$ is the probability mass function of $\hat{y}$ induced from encoder $f_\phi$ and $p(x)$, which satisfies $P_{\hat{y}}(\hat{y}) = \int p(x)\,\delta(\hat{y} - \lfloor f_\phi(x) \rceil)\,dx$, where $\delta$ denotes the Dirac delta function. Using the method of Lagrange multipliers, the RD optimization problem is given by
$\min_{\phi, \theta} \; R + \lambda D$  (2)
for $\lambda > 0$; the scalar factor $\lambda$ in the Lagrangian is called a Lagrange multiplier. The Lagrange multiplier is the factor that selects a specific RD tradeoff point (e.g., see [17]).
Relaxation with universal quantization. The rate and the distortion provided in (1) are not differentiable with respect to the network parameters $\phi$ and $\theta$, due to the rounding function $\lfloor \cdot \rceil$ and the Dirac delta function $\delta$, and thus it is not straightforward to optimize (2) through gradient descent. It was proposed in [3] to model the quantization error as additive uniform stochastic noise to relax the optimization of (2). The same technique was adopted in [4, 16, 14]. In this paper, we instead propose employing universal quantization [30, 29] to relax the problem (see Remark 2).
Universal quantization dithers every element of $y$ with one common uniform random variable as follows:

$\hat{y} = \Delta \lfloor (y + u) / \Delta \rceil - u$  (3)

where the dithering vector $u$ consists of repetitions of a single uniform random variable $U$ with support $[-\Delta/2, \Delta/2]$. We fix $\Delta = 1$ just for now. In each dimension, universal quantization is effectively identical in distribution to adding uniform noise independent of the source, although the noise induced from universal quantization is dependent across dimensions. Note that universal quantization is approximated as a linear function with unit slope (i.e., identity gradient) in the backpropagation of the network training.
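As an illustration, the dithered quantizer of (3) can be sketched in a few lines (a minimal NumPy sketch, not the authors' implementation; the function name is ours):

```python
import numpy as np

def universal_quantize(y, delta=1.0, rng=None):
    """Universal quantization of (3): dither with ONE shared uniform variable.

    Every element of y is shifted by the same u ~ Unif(-delta/2, delta/2),
    rounded to the quantization lattice of bin size delta, and shifted back.
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(-delta / 2, delta / 2)  # a single scalar shared by all dimensions
    return delta * np.round((y + u) / delta) - u
```

In training, the backward pass would treat this operation as the identity (unit slope), as noted above; regardless of the dither, the quantization error is bounded by half the bin size.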
Remark 1.
To our knowledge, we are the first to adopt universal quantization in the framework of training image compression networks. In [6], universal quantization was used for efficient weight compression of deep neural networks, which is different from our usage here. We observed from our experiments that our relaxation with universal quantization provides some gain over the conventional method of adding independent uniform noise (see Figure 3).
Differentiable RD cost function. Under the relaxation with universal quantization, similar to (1), the rate and the distortion can be expressed as below:
$R = \mathbb{E}_{x,u}[-\log_2 p(\tilde{y})], \quad D = \mathbb{E}_{x,u}[\|x - g_\theta(\tilde{y})\|_2^2]$  (4)

where $\tilde{y}$ denotes the dithered latent representation given by (3). The stochastic quantization model makes $\tilde{y}$ have a continuous density $p$, which is a continuous relaxation of the probability mass function $P_{\hat{y}}$ in (1), but $p$ is still usually intractable to compute. Thus, we further adopt an approximation of $p$ by a tractable density $q$ that is differentiable with respect to $\tilde{y}$ and its own parameters. Then, it follows that

$\mathbb{E}_{x,u}[-\log_2 p(\tilde{y})] = \mathbb{E}_{x,u}[-\log_2 q(\tilde{y})] - D_{\mathrm{KL}}(p \| q) \le \mathbb{E}_{x,u}[-\log_2 q(\tilde{y})]$  (5)

where $D_{\mathrm{KL}}$ denotes the Kullback-Leibler (KL) divergence (e.g., see [7, p. 19]); the equality in (5) holds when $q = p$. The choice of $q$ in our implementation is deferred to Section 4 (see (12)–(14)).
From (2) and (4), approximating the rate by its upper bound in (5), the RD optimization problem reduces to

$\min_{\phi, \theta, q} \; \mathbb{E}_{x,u}[-\log_2 q(\tilde{y}) + \lambda \|x - g_\theta(\tilde{y})\|_2^2]$  (6)

for $\lambda > 0$. Optimizing a network for different values of $\lambda$, one can trade off the quality against the rate.
Remark 2.
The objective function in (6) has the same form as in auto-encoding variational Bayes [11], given that the posterior is uniform. This relation was already established in previous works, and detailed discussions can be found in [3, 4]. Our contribution in this section is to deploy universal quantization (see (3)) to guarantee that the quantization error is uniform and independent of the source distribution, instead of artificially adding uniform noise, when generating random samples of $\tilde{y}$ in the Monte Carlo estimation of (6).
3 Variable-rate image compression
To adapt the quality and the rate of compressed images, we basically need to optimize the RD Lagrange function in (6) for varying values of the Lagrange multiplier $\lambda$. That is, one has to train multiple networks, or retrain a network, while varying the Lagrange multiplier. Training and deploying multiple networks are not practical, in particular when we want to cover a broad range of the RD curve with fine resolution and each network is of a large size. In this section, we develop a variable-rate model that can be deployed once and then used to produce compressed images of varying quality with different rates, depending on a user's requirements, with no need for retraining.
3.1 Conditional autoencoder
To avoid training and deploying multiple networks, we propose training one conditional autoencoder, conditioned on the Lagrange multiplier $\lambda$. The network takes $\lambda$ as a conditioning input parameter, along with the input image, and produces a compressed image whose rate and distortion vary depending on the conditioning value of $\lambda$. To this end, the rate and distortion terms in (4) and (5) are altered into their conditional counterparts $R(\lambda)$ and $D(\lambda)$ for $\lambda \in \Lambda$, where $\Lambda$ is a predefined finite set of Lagrange multiplier values, and then we minimize the following combined objective function:

$\min_{\phi, \theta, q} \; \sum_{\lambda \in \Lambda} [R(\lambda) + \lambda D(\lambda)]$  (7)
To implement a conditional autoencoder, we develop a conditional convolution, conditioned on the Lagrange multiplier $\lambda$, as shown in Figure 4. Let $X_i$ be a 2-dimensional (2D) input feature map of channel $i$ and $Y_j$ be a 2D output feature map of channel $j$. Let $K_{i,j}$ be a 2D convolutional kernel for input channel $i$ and output channel $j$. Our conditional convolution yields

$Y_j = s_j(\lambda) \left( \sum_i X_i * K_{i,j} \right) + b_j(\lambda)$  (8)

where $*$ denotes 2D convolution. The channel-wise scaling factor $s_j(\lambda)$ and the additive bias term $b_j(\lambda)$ depend on $\lambda$ by

$s_j(\lambda) = \mathrm{softplus}(u_j^T \mathbf{1}_\lambda), \quad b_j(\lambda) = v_j^T \mathbf{1}_\lambda$  (9)

where $u_j$ and $v_j$ are the fully-connected layer weight vectors of length $|\Lambda|$ for output channel $j$; $(\cdot)^T$ denotes the transpose, and $\mathbf{1}_\lambda$ is the one-hot encoding of $\lambda$ over $\Lambda$.

Remark 3.
The proposed conditional convolution is similar to the one proposed for conditional PixelCNN [26]. In [26], the conditioning variables are typically labels, attributes, or partial observations of the target output, while our conditioning variable is the Lagrange multiplier, i.e., the hyperparameter that trades off the quality against the rate in the compression problem. A gated-convolution structure is presented in [26], but we develop a simpler structure so that the additional computational cost of conditioning is marginal.
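The conditional scaling and bias of (8) and (9) can be sketched as below (a NumPy toy with 1×1 kernels for brevity; the names `U` and `V` are ours, and the exact layer details are simplified relative to the real network):

```python
import numpy as np

def softplus(x):
    # smooth, strictly positive activation used for the channel-wise scale
    return np.log1p(np.exp(x))

def conditional_conv1x1(X, K, U, V, lam_idx, n_lambda):
    """Conditional 1x1 convolution: Y_j = s_j(lam) * sum_i X_i * K_ij + b_j(lam).

    X: (C_in, H, W) input feature maps; K: (C_out, C_in) 1x1 kernels.
    U, V: (C_out, n_lambda) fully-connected weights producing the scale and
    bias from the one-hot encoding of the Lagrange multiplier index.
    """
    onehot = np.eye(n_lambda)[lam_idx]             # one-hot encoding of lambda
    s = softplus(U @ onehot)                       # (C_out,) channel-wise scales
    b = V @ onehot                                 # (C_out,) channel-wise biases
    Y = np.einsum('oi,ihw->ohw', K, X)             # 1x1 convolution over channels
    return s[:, None, None] * Y + b[:, None, None]
```

The same pair of tiny fully-connected layers is attached to every convolution in the network, so the overhead of conditioning is a per-channel scale and bias only.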
3.2 Training with mixed bin sizes
[Figure 5: (a) varying $\lambda$ for fixed $\Delta$; (b) varying $\Delta$ for fixed $\lambda$; (c) varying the mixing range of $\Delta$ in training.]
We established a variable-rate conditional autoencoder model, conditioned on the Lagrange multiplier $\lambda$, in the previous subsection, but only finitely many discrete points of the RD curve can be obtained from it, since $\lambda$ is selected from a predetermined finite set $\Lambda$.¹ To extend the coverage to the whole continuous range of the RD curve, we develop another (continuous) knob to control the rate, i.e., the quantization bin size.

¹ The conditioning part can be modified to take continuous values, which however did not produce good results in our trials.
Recall that in the previous RD formulation (1), we fixed the quantization bin size $\Delta = 1$, i.e., we simply used rounding for quantization. In actual inference, we can change the bin size to adapt the rate: the larger the bin size, the lower the rate. However, the performance naturally suffers from mismatched bin sizes in training and inference. For a trained network to be robust and accurate for varying bin sizes, we propose training (or fine-tuning) it with mixed bin sizes.
In training, we draw the uniform noise $u$ in (3) for various noise levels, i.e., for random $\Delta$. The range of $\Delta$ and the mixing distribution within the range are design choices. In our experiments, we choose $\Delta = 2^a$, where $a$ is uniformly drawn from $[-1, 1]$ so we can cover $\Delta \in [0.5, 2]$. The larger the range of $\Delta$, the broader the range of the RD curve a network is optimized for, but the performance also degrades. In Figure 5(c), we compare the RD curves obtained from networks trained with mixed bin sizes of different ranges; we used a fixed $\lambda$ in training the networks just for this experiment. We found that mixing bin sizes in $[0.5, 2]$ yields the best performance, although the coverage is limited, which is not a problem since we can cover large-scale rate adaptation by changing the input Lagrange multiplier $\lambda$ in our conditional model (see Figure 5(a,b)).
In summary, we solve the following optimization:
$\min_{\phi, \theta, q} \; \sum_{\lambda \in \Lambda} \mathbb{E}_{\Delta \sim p(\Delta)} [R(\lambda, \Delta) + \lambda D(\lambda, \Delta)]$  (10)
where $p(\Delta)$ is a predefined mixing density for $\Delta$, and
(11) 
Remark 4.
In training, we compute neither the summation over $\Lambda$ nor the expectation over $\Delta$ in (10). Instead, we randomly select $\lambda$ uniformly from $\Lambda$ and draw $\Delta$ from $p(\Delta)$ for each image to compute its individual RD cost, and then we use the average RD cost per batch as the loss for gradient descent, which makes the training scalable.
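The per-image sampling of Remark 4 can be sketched as follows (an illustrative helper; the range $\Delta \in [0.5, 2]$ via $\Delta = 2^a$, $a \sim \mathrm{Unif}(-1, 1)$ matches the mixing distribution described in Section 3.2):

```python
import numpy as np

def sample_conditions(lambdas, batch_size, rng=None):
    """Draw one (lambda, delta) pair per image in the batch.

    lambda is picked uniformly from the finite set Lambda; delta = 2**a with
    a ~ Unif(-1, 1), so that delta covers [0.5, 2].
    """
    rng = np.random.default_rng() if rng is None else rng
    lams = rng.choice(lambdas, size=batch_size)            # uniform over Lambda
    deltas = 2.0 ** rng.uniform(-1.0, 1.0, size=batch_size)
    return lams, deltas
```

Each image's RD cost is then computed under its own $(\lambda, \Delta)$ pair, and the batch average is used as the gradient-descent loss.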
3.3 Inference
Rate adaptation. The rate increases as we decrease the Lagrange multiplier $\lambda$ and/or the quantization bin size $\Delta$. In Figure 5(a,b), we show how the rate varies as we change $\lambda$ and $\Delta$. In (a), we change $\lambda$ within the set $\Lambda$ from (15) for each fixed $\Delta$. In (b), we vary $\Delta$ in $[0.5, 2]$ while fixing $\lambda$ at some selected values. Given a user's target rate, large-scale discrete rate adaptation is achieved by changing $\lambda$, while fine continuous rate adaptation can be performed by adjusting $\Delta$ for fixed $\lambda$. When the RD curves overlap at the target rate (e.g., see Figure 5(a)), we select the combination of $\lambda$ and $\Delta$ that produces better performance.²

² In practice, one can make a set of preselected combinations of $\lambda$ and $\Delta$, similar to the set of quality factors in JPEG or BPG.
Compression. After selecting $\lambda$ and $\Delta$, we perform one-hot encoding of $\lambda$ and use it in all conditional convolutional layers to encode a latent representation of the input. Then, we perform regular deterministic quantization of the encoded representation with the selected quantization bin size $\Delta$. The quantized latent representation is finally encoded into a compressed bitstream with entropy coding, e.g., arithmetic coding; we additionally need to store the values of the conditioning variables, $\lambda$ and $\Delta$, used in encoding.
Decompression. We decode the compressed bitstream and also retrieve the values of $\lambda$ and $\Delta$ used in encoding. We restore the quantized latent representation from the decoded integer values by multiplying them by the quantization bin size $\Delta$. The restored latent representation is then fed to the decoder to reconstruct the image. The value of $\lambda$ used in encoding is again used in all conditional deconvolutional layers for conditional generation.
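The compression and decompression steps above can be sketched as a round trip (a minimal sketch: the entropy-coding stage is abstracted away, and the header dict is a hypothetical container for the side information $\lambda$ and $\Delta$):

```python
import numpy as np

def encode_latent(y, lam_idx, delta):
    """Quantize the latent with bin size delta; the integer symbols would go
    to the arithmetic coder, and (lam_idx, delta) are stored as side info."""
    symbols = np.round(y / delta).astype(np.int64)
    header = {"lam_idx": lam_idx, "delta": delta}
    return symbols, header

def decode_latent(symbols, header):
    """Restore the quantized latent by scaling decoded integers by the bin size."""
    return symbols * header["delta"]
```

The latent reconstruction error is bounded by half the bin size, which is what makes $\Delta$ work as a fine-grained rate knob.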
4 Refined probabilistic model
In this section, we discuss how we refine the baseline model of the previous section to improve the performance. The model refinement is orthogonal to the rate adaptation schemes in Section 3. From (11), we introduce a secondary latent variable $z$ on which the entropy model of $\tilde{y}$ is conditioned. For compression, we encode $\tilde{y}$ from the input $x$, and then we further encode $z$ from $\tilde{y}$. The encoded representations $\tilde{y}$ and $z$ are entropy-coded based on their respective entropy models $q(\tilde{y} \mid z)$ and $q(z)$. For decompression, we first decode $z$, which is then used to compute $q(\tilde{y} \mid z)$ and to decode $\tilde{y}$. This model is further refined by introducing autoregressive models for $q(\tilde{y} \mid z)$ and $q(z)$ as below:

(12) 

where $\tilde{y}_i$ and $z_i$ denote the $i$-th elements of $\tilde{y}$ and $z$, respectively. In Figure 6, we illustrate a graph representation of our refined variable-rate image compression model.
In our experiments, we use
(13) 
where $\mu_i$ and $\sigma_i$ are the mean and the scale for the $i$-th element, and $\Phi$ is derived from the standard normal density; $\mu_i$ and $\sigma_i$ are parameterized with autoregressive neural networks, e.g., consisting of masked convolutions [26], which are also conditioned on $\lambda$ as in Figure 4. Similarly, we let
(14) 
where each factor of $q(z)$ is designed as a univariate density model parameterized with a neural network, as described in [4, Appendix 6.1].
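As an illustration of a tractable conditional density in the style of (13), the widely used Gaussian-convolved-with-uniform model of [4, 16] evaluates the probability mass of a quantized value from the standard normal CDF (a sketch under that assumption; in our model the mean and scale would come from the conditioned autoregressive networks):

```python
import math

def std_normal_cdf(x):
    # CDF of the standard normal, via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def relaxed_pmf(y, mu, sigma, delta=1.0):
    """Probability of y under N(mu, sigma^2) convolved with Unif(-delta/2, delta/2):
    the Gaussian mass falling in the bin of width delta centered at y."""
    upper = std_normal_cdf((y - mu + delta / 2.0) / sigma)
    lower = std_normal_cdf((y - mu - delta / 2.0) / sigma)
    return upper - lower

def code_length_bits(y, mu, sigma, delta=1.0):
    # ideal arithmetic-code length, in bits, for this element
    return -math.log2(relaxed_pmf(y, mu, sigma, delta))
```

The per-element code lengths produced this way are exactly the bit allocations visualized later in Figure 10.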
5 Experiments
We illustrate the network architecture that we used in our experiments in Figure 7. We emphasize that all convolution (including masked convolution) blocks employ conditional convolutions (see Figure 4 in Section 3.1).
Training. For the training dataset, we used the ImageNet ILSVRC 2012 dataset [20]. We resized the training images so that the shorter of the width and the height is , and we extracted patches at random locations. In addition to the ImageNet dataset, we used the training dataset provided by the Workshop and Challenge on Learned Image Compression (CLIC).³ For the CLIC training dataset, we extracted patches at random locations without resizing. We used the Adam optimizer [10] and trained a model for epochs, where each epoch consists of k batches and the batch size is set to . The learning rate was initially set to , and we decreased it to and at and epochs, respectively.

³ https://www.compression.cc

We pretrained a conditional model that can be conditioned on different values of the Lagrange multiplier $\lambda$ in $\Lambda$ for a fixed bin size $\Delta = 1$, where
(15) 
In pretraining, we used the MSE loss. Then, we retrained the model for mixed bin sizes; the quantization bin size is selected randomly as $\Delta = 2^a$, where $a$ is drawn uniformly between $-1$ and $1$ so that we cover $\Delta \in [0.5, 2]$. In the retraining with mixed bin sizes, we used one of the MSE, MS-SSIM, and combined MSE+MS-SSIM losses (see Figure 9). We used the same training datasets and the same training procedure for pretraining and retraining. We also trained multiple fixed-rate models for fixed $\lambda$ and $\Delta$ for comparison.
Experimental results. We compare the performance of our variable-rate model to the state-of-the-art learned image compression models from [19, 15, 4, 16, 9, 14] and to the classical state-of-the-art variable-rate image compression codec, BPG [5], on the Kodak image set [12]. Some of the previous models were optimized for MSE, and some of them were optimized for a perceptual measure, MS-SSIM. Thus, we compare the two measures separately in Figure 8. In particular, we included the results for the RNN-based variable-rate compression model in [9], which were obtained from [4]. All the previous works in Figure 8, except [9], trained multiple networks to obtain the multiple points of their RD curves.

For our variable-rate model, we plotted curves of the same blue color for PSNR and MS-SSIM, respectively, in Figure 8. Each curve corresponds to one of the Lagrange multiplier values in (15). For each $\lambda$, we varied the quantization bin size $\Delta$ in $[0.5, 2]$ to get each curve. Our variable-rate model outperforms BPG in both PSNR and MS-SSIM. It also performs comparably to, and in some cases better than, the state-of-the-art learned image compression models [16, 14] that outperform BPG by deploying multiple networks trained for varying rates.

Our model shows superior performance over the RNN-based variable-rate model in [9]. The RNN-based model requires multiple encoding/decoding iterations at high rates, implying that the complexity increases as more iterations are needed to achieve better quality. In contrast, our model uses a single iteration, i.e., the encoding/decoding complexity is fixed, for all rates. Moreover, our model can produce any point of the RD curve with arbitrarily fine resolution by tuning the continuous rate-adaptive parameter, the quantization bin size $\Delta$, whereas the RNN-based model can produce only finitely many points of the RD curve, depending on how many bits it encodes in each recurrent stage.

In Figure 9, we compare our variable-rate networks optimized for the MSE, MS-SSIM, and combined MSE+MS-SSIM losses, respectively. We also plotted the results from our fixed-rate networks trained for fixed $\lambda$ and $\Delta$. Observe that our variable-rate network performs very close to the ones individually optimized for fixed $\lambda$ and $\Delta$. Here, we emphasize that our variable-rate network optimized for MSE performs better than BPG in both PSNR and MS-SSIM.
[Figure 10: Ground truth, the latent representations $\tilde{y}$ and $z$, and the number of bits assigned to each element of $\tilde{y}$ and $z$ in arithmetic coding, at five rate points:]

Bits per pixel (BPP)  1.8027   0.8086   0.6006   0.4132   0.1326
PSNR (dB)             41.3656  36.2535  34.8283  33.1478  29.2833
MS-SSIM               0.9951   0.9863   0.9819   0.9737   0.9249
Figure 10 shows compressed images generated from our variable-rate model to assess their visual quality. We also depict the number of bits (implicitly) used to represent each element of $\tilde{y}$ and $z$ in arithmetic coding, i.e., the negative log-probabilities given by (12)–(14). We randomly selected two and four channels from the two latent representations, respectively, and show the code length for each latent representation value in the figure. As we change the conditioning parameters $\lambda$ and $\Delta$, we can adapt the arithmetic code length that determines the rate of the latent representation. Observe that, the smaller the values of $\lambda$ and/or $\Delta$, the more bits the resulting latent representation requires in arithmetic coding and the higher the rate, as expected.
6 Conclusion
This paper proposed a variable-rate image compression framework with a conditional autoencoder. Unlike previous learned image compression methods that train multiple networks to cover various rates, we train and deploy one variable-rate model that provides two knobs to control the rate, i.e., the Lagrange multiplier and the quantization bin size, which are given as inputs to the conditional autoencoder model. Our experimental results showed that the proposed scheme provides better performance than classical image compression codecs such as JPEG2000 and BPG. Our method also showed comparable and sometimes better performance than recent learned image compression methods that outperform BPG but need multiple networks trained for different compression rates. We finally note that the proposed conditional neural network can be adopted in deep learning not only for image compression but, in general, to solve any optimization problem that can be formulated with the method of Lagrange multipliers.
References
[1] (2017) Soft-to-hard vector quantization for end-to-end learning compressible representations. In Advances in Neural Information Processing Systems, pp. 1141–1151.
[2] (2016) Density modeling of images using a generalized normalization transformation. In International Conference on Learning Representations.
[3] (2017) End-to-end optimized image compression. In International Conference on Learning Representations.
[4] (2018) Variational image compression with a scale hyperprior. In International Conference on Learning Representations.
[5] (2014) BPG image format. https://bellard.org/bpg
[6] (2018) Universal deep neural network compression. In NeurIPS Workshop on Compact Deep Neural Network Representation with Industrial Applications (CDNNRIA).
[7] (2012) Elements of information theory. John Wiley & Sons.
[8] (2001) Theoretical foundations of transform coding. IEEE Signal Processing Magazine 18 (5), pp. 9–21.
[9] (2018) Improved lossy image compression with priming and spatially adaptive bit rates for recurrent networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4385–4393.
[10] (2015) Adam: a method for stochastic optimization. In International Conference on Learning Representations.
[11] (2014) Auto-encoding variational Bayes. In International Conference on Learning Representations.
[12] (1993) Kodak lossless true color image suite (PhotoCD PCD0992). http://r0k.us/graphics/kodak
[13] (2015) Deep learning. Nature 521 (7553), pp. 436–444.
[14] (2019) Context-adaptive entropy model for end-to-end optimized image compression. In International Conference on Learning Representations.
[15] (2018) Conditional probability models for deep image compression. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4394–4402.
[16] (2018) Joint autoregressive and hierarchical priors for learned image compression. In Advances in Neural Information Processing Systems, pp. 10794–10803.
[17] (1998) Rate-distortion methods for image and video compression. IEEE Signal Processing Magazine 15 (6), pp. 23–50.
[18] (2002) JPEG2000: image compression fundamentals, standards and practice. Journal of Electronic Imaging 11 (2), pp. 286.
[19] (2017) Real-time adaptive image compression. In Proceedings of the International Conference on Machine Learning, pp. 2922–2930.
[20] (2015) ImageNet large scale visual recognition challenge. International Journal of Computer Vision 115 (3), pp. 211–252.
[21] (2015) Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pp. 3483–3491.
[22] (2012) Overview of the high efficiency video coding (HEVC) standard. IEEE Transactions on Circuits and Systems for Video Technology 22 (12), pp. 1649–1668.
[23] (2017) Lossy image compression with compressive autoencoders. In International Conference on Learning Representations.
[24] (2017) Full resolution image compression with recurrent neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5306–5314.
[25] (2018) VAE with a VampPrior. In International Conference on Artificial Intelligence and Statistics, pp. 1214–1223.
[26] (2016) Conditional image generation with PixelCNN decoders. In Advances in Neural Information Processing Systems, pp. 4790–4798.
[27] (1992) The JPEG still picture compression standard. IEEE Transactions on Consumer Electronics 38 (1), pp. xviii–xxxiv.
[28] (2003) Multi-scale structural similarity for image quality assessment. In Asilomar Conference on Signals, Systems & Computers, Vol. 2, pp. 1398–1402.
[29] (1992) On universal quantization by randomized uniform/lattice quantizers. IEEE Transactions on Information Theory 38 (2), pp. 428–436.
[30] (1985) On universal quantization. IEEE Transactions on Information Theory 31 (3), pp. 344–347.
A Comparison of our refined probabilistic model to [16]
B More example images
As supplementary material, we provide more example images produced by our variable-rate image compression network optimized for the MSE loss. We compare our method to the classical image compression codecs, i.e., JPEG, JPEG2000, and BPG. We adapt and match the compression rate of our variable-rate network to the rate of BPG by adjusting the Lagrange multiplier $\lambda$ and the quantization bin size $\Delta$. All the examples show that our method outperforms the state-of-the-art BPG codec in both PSNR and MS-SSIM at the same bits per pixel (BPP). Visually, our method provides better quality with fewer artifacts than the classical image compression codecs. We put orange boxes to highlight the visual differences in Figure 11 and Figure 13, and the orange-boxed areas are magnified in Figure 12 and Figure 14, respectively.
[Figure 11: Ground truth and compressed images.]

                      Ours     BPG (4:4:4)  JPEG2000  JPEG
Bits per pixel (BPP)  0.2078   0.2078       0.2092    0.2098
PSNR (dB)             32.4296  32.0406      30.9488   28.1758
MS-SSIM               0.9543   0.9488       0.9342    0.8777
[Figure 13: Ground truth and compressed images.]

                      Ours     BPG (4:4:4)  JPEG2000  JPEG
Bits per pixel (BPP)  0.1289   0.1289       0.1298    0.1299
PSNR (dB)             34.4543  33.3546      31.8927   27.1270
MS-SSIM               0.9695   0.9593       0.9482    0.8404