A set of LZ codecs, with reference implementations, designed for very fast branchless decompression on modern x64 processors. LZSSE offers some of the fastest open-source decompression of its kind, with decompression rates on modern processors often reaching several gigabytes per second per core.
LZSSE is an LZSS variant designed for a branchless SSE decompression implementation. It comes in three variants:
- LZSSE2, for high compression files with small literal runs
- LZSSE4, for a more balanced mix of literals and matches
- LZSSE8, for lower compression data with longer runs of matches
All three variants have an optimal parser implementation, which uses a fairly strong match finder (very similar to LzFind) combined with a Storer-Szymanski style parse. LZSSE4 and LZSSE8 also have “fast” compressor implementations, which use simple hash-table-based matching and a greedy parse.
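The idea behind a "fast" compressor (hash-table matching plus a greedy parse) can be sketched roughly as below. This is a generic illustration and not the LZSSE code: the `Token` struct, `Hash4`, `GreedyParse`, `Decode`, the 4-byte minimum match, and the 12-bit table size are all assumptions made for the example, and the real codecs write a packed bitstream rather than a token vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical token type for illustration only; NOT the LZSSE format.
struct Token {
    bool          isMatch;  // true: copy `length` bytes from `offset` bytes back
    std::uint8_t  literal;  // valid when !isMatch
    std::uint32_t offset;   // valid when isMatch
    std::uint32_t length;   // valid when isMatch
};

constexpr std::size_t kMinMatch = 4;   // assumed minimum useful match length
constexpr std::size_t kHashBits = 12;  // assumed table size: 4096 entries

// Hash the next 4 bytes into a table index (multiplicative hash).
inline std::uint32_t Hash4(const std::uint8_t* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof(v));
    return (v * 2654435761u) >> (32 - kHashBits);
}

// Greedy parse: at each position, take the single hash-table candidate if it
// yields a long enough match; otherwise emit one literal and move on.
std::vector<Token> GreedyParse(const std::string& input) {
    const auto* data = reinterpret_cast<const std::uint8_t*>(input.data());
    const std::size_t size = input.size();
    std::vector<std::int64_t> table(std::size_t{1} << kHashBits, -1);
    std::vector<Token> tokens;

    std::size_t pos = 0;
    while (pos < size) {
        std::size_t matchLength = 0;
        std::size_t candidate = 0;
        if (pos + kMinMatch <= size) {
            const std::uint32_t h = Hash4(data + pos);
            const std::int64_t prev = table[h];
            table[h] = static_cast<std::int64_t>(pos);  // remember this position
            if (prev >= 0) {
                candidate = static_cast<std::size_t>(prev);
                // Extend the match; it may overlap the current position.
                while (pos + matchLength < size &&
                       data[candidate + matchLength] == data[pos + matchLength]) {
                    ++matchLength;
                }
            }
        }
        if (matchLength >= kMinMatch) {
            tokens.push_back({true, 0,
                              static_cast<std::uint32_t>(pos - candidate),
                              static_cast<std::uint32_t>(matchLength)});
            pos += matchLength;  // greedy: accept the match, skip past it
        } else {
            tokens.push_back({false, data[pos], 0, 0});
            ++pos;
        }
    }
    return tokens;
}

// Reference decoder for the hypothetical tokens, used to check round trips.
std::string Decode(const std::vector<Token>& tokens) {
    std::string out;
    for (const Token& t : tokens) {
        if (!t.isMatch) {
            out.push_back(static_cast<char>(t.literal));
            continue;
        }
        assert(t.offset >= 1 && t.offset <= out.size());
        const std::size_t start = out.size() - t.offset;
        for (std::uint32_t i = 0; i < t.length; ++i) {
            out.push_back(out[start + i]);  // byte-wise so overlapping copies work
        }
    }
    return out;
}
```

A greedy parse accepts the first acceptable match without checking whether a literal followed by a longer match would be cheaper. That trade-off is what an optimal parse avoids, at the cost of compression speed.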
Currently LZSSE8 is the recommended variant for general use, as it performs well in most cases (and gives you the choice of optimal parse or fast compression). LZSSE2 is recommended if you are only compressing text, especially heavily compressible text, but it is slower and compresses less well on less compressible data and binaries.
The code is approaching production readiness and LZSSE2 and LZSSE8 have received a reasonable amount of testing.
See the blog posts “An LZ Codec Designed for SSE Decompression” and “Compressor Improvements and LZSSE2 vs LZSSE8” for a description of how the compression algorithm and implementation work. There are also benchmarks, but these may not be up to date (in particular, the figures in the initial blog post no longer represent compression performance).