Humans are still the best lossy image compressors

10/25/2018 ∙ by Ashutosh Bhown, et al. ∙ Stanford University

Lossy image compression has been studied extensively in the context of typical loss functions such as RMSE and MS-SSIM. However, it is not well understood which loss function is most appropriate for human perception. Furthermore, the availability of massive public image datasets appears to have hardly been exploited in image compression. In this work, we perform compression experiments in which one human describes images to another, using publicly available images and text instructions. These image reconstructions are rated by human scorers on the Amazon Mechanical Turk platform and compared to reconstructions obtained by existing image compressors. In our experiments, the humans outperform the state-of-the-art compressor WebP in the MTurk survey on most images, which shows that there is significant room for improvement in image compression for human perception. The images, results and additional data are available at https://compression.stanford.edu/human-compression.
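
The comparison described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the size of a human description is measured as the length of the text instructions after general-purpose compression (bzip2 here), and that a reconstruction "wins" on an image when its mean rating is higher. The function names, the sample instructions and the sample ratings are hypothetical and are not taken from the experiments.

```python
import bz2


def compressed_size_bytes(description: str) -> int:
    """Size in bytes of the bzip2-compressed UTF-8 text description.

    Assumption: the text instructions are the entire compressed
    representation of the image on the human side.
    """
    return len(bz2.compress(description.encode("utf-8")))


def mean_score(scores):
    """Arithmetic mean of a non-empty list of ratings."""
    return sum(scores) / len(scores)


def human_wins(human_scores, codec_scores) -> bool:
    """True if the human reconstruction has the higher mean rating."""
    return mean_score(human_scores) > mean_score(codec_scores)


# Hypothetical instructions, repeated only to give the compressor some text.
instructions = (
    "Find a public photo of a lighthouse at sunset. "
    "Crop the left third and darken the sky slightly. "
) * 20

raw_bytes = len(instructions.encode("utf-8"))
budget_bytes = compressed_size_bytes(instructions)
print(f"raw text: {raw_bytes} bytes, compressed: {budget_bytes} bytes")

# Hypothetical ratings for one image from three scorers each.
print(human_wins([7, 8, 6], [5, 6, 6]))
```

Under these assumptions, `budget_bytes` would be the target file size for the conventional compressor, so that both reconstructions are rated at a comparable size.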

