Stream VByte: Faster Byte-Oriented Integer Compression

09/25/2017
by Daniel Lemire, et al.

Arrays of integers are often compressed in search engines. Though there are many ways to compress integers, we are interested in the popular byte-oriented integer compression techniques (e.g., VByte or Google's Varint-GB). They are appealing due to their simplicity and engineering convenience. Amazon's varint-G8IU is one of the fastest byte-oriented compression techniques published so far. It makes judicious use of the powerful single-instruction-multiple-data (SIMD) instructions available in commodity processors. To surpass varint-G8IU, we present Stream VByte, a novel byte-oriented compression technique that separates the control stream from the encoded data. Like varint-G8IU, Stream VByte is well suited for SIMD instructions. We show that Stream VByte decoding can be up to twice as fast as varint-G8IU decoding over real data sets. In this sense, Stream VByte establishes new speed records for byte-oriented integer compression, at times exceeding the speed of the memcpy function. On a 3.4 GHz Haswell processor, it decodes more than 4 billion differentially-coded integers per second from RAM to L1 cache.
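To make the idea of separating the control stream from the encoded data concrete, here is a minimal scalar sketch in Python. It is an illustration under stated assumptions, not the SIMD implementation from the paper. It assumes 32-bit unsigned integers, a 2-bit length code per integer (bytes used minus one), four codes packed per control byte with the first integer in the lowest bits, and little-endian data bytes. The function names `encode` and `decode` are hypothetical.

```python
def encode(values):
    """Split 32-bit unsigned integers into a control stream and a data stream.

    Assumed layout (illustrative): each integer uses 1 to 4 data bytes,
    and its length minus one is stored as a 2-bit code in the control stream.
    """
    control = bytearray()
    data = bytearray()
    for start in range(0, len(values), 4):
        key = 0
        for slot, value in enumerate(values[start:start + 4]):
            if not 0 <= value < 2**32:
                raise ValueError("value out of 32-bit unsigned range")
            # Number of bytes needed; zero still takes one byte.
            nbytes = max(1, (value.bit_length() + 7) // 8)
            key |= (nbytes - 1) << (2 * slot)
            data += value.to_bytes(nbytes, "little")
        control.append(key)
    return bytes(control), bytes(data)


def decode(control, data, count):
    """Rebuild `count` integers by reading lengths from the control stream."""
    out = []
    pos = 0
    for i in range(count):
        key = control[i // 4]
        nbytes = ((key >> (2 * (i % 4))) & 0b11) + 1
        out.append(int.from_bytes(data[pos:pos + nbytes], "little"))
        pos += nbytes
    return out


values = [1, 300, 70000, 2**32 - 1, 0]
control, data = encode(values)
restored = decode(control, data, len(values))
```

Because all length codes sit together in their own stream, a decoder can read one control byte and know the layout of the next four integers before touching the data bytes. That property is what makes the format a natural fit for table-driven SIMD shuffles, which this scalar sketch does not attempt to reproduce.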
