Sunday, October 18, 2020

Parallel Compressor Performance for Science - pigz, lbzip2, xz

UPDATE: At the request of a friend I looked into zstd, and wow, it's a great option. As it becomes more ubiquitous it should likely replace most compressors. Compression is similar to xz and speed approaches lz4 for a modest CPU increase.

Original Post

As data volumes grow and single-core performance grows more slowly than core count, compressing large volumes of data quickly requires compressors that can use multiple cores to keep up with the data volumes and hardware investments.

Luckily, several parallel compressors are available that stay compatible with the existing formats, but how do they perform compared to classic gzip?  And how well do they work on scientific data?  Scientific data often has a few very large files, usually binary, and thousands of small files that are compressible.

The Host

All tests were done on the Great Lakes login node.  The properties of this node are:

  • 36 core 36 thread Intel Xeon 6154 
  • 192 GB Memory
  • 1.9PB GPFS File System 
  • 100Gbps HDR Network

The Data

The data set has the following properties:

  • 6649 files
  • 276 directories
  • 221 GB total size

 Range                       Number
[   0.000  B -   0.000  B ) 1
[   0.000  B -   1.000 KB ) 560
[   1.000 KB -   1.000 MB ) 4935
[   1.000 MB -  10.000 MB ) 1175
[  10.000 MB - 100.000 MB ) 116
[ 100.000 MB -   1.000 GB ) 94
[   1.000 GB -  10.000 GB ) 43
[  10.000 GB - 100.000 GB ) 1
[ 100.000 GB -   1.000 TB ) 0
[   1.000 TB -        MAX ) 0
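
The post doesn't say which tool produced the histogram above. For anyone who wants a similar breakdown of their own data, here is a rough sketch using only find, wc and awk. The function name size_histogram and the coarser bin edges are my own choices for illustration, 1 KB is assumed to mean 1024 bytes, and only regular files are counted, so the numbers will not necessarily line up with the table above.

```shell
# size_histogram <directory>
# Hypothetical helper: counts regular files per size bin (1 KB = 1024 bytes assumed).
size_histogram() {
    # wc -c prints "<bytes> <path>" for each file; awk only looks at the byte count
    find "$1" -type f -exec wc -c {} \; | LC_ALL=C awk '
        BEGIN { KB = 1024; MB = KB * 1024; GB = MB * 1024 }
        {
            s = $1
            if      (s < KB)       n["lt1KB"]++
            else if (s < MB)       n["1KB-1MB"]++
            else if (s < 10 * MB)  n["1MB-10MB"]++
            else if (s < 100 * MB) n["10MB-100MB"]++
            else if (s < GB)       n["100MB-1GB"]++
            else                   n["ge1GB"]++
        }
        END {
            # print the bins in a fixed order, showing 0 for empty ones
            split("lt1KB 1KB-1MB 1MB-10MB 10MB-100MB 100MB-1GB ge1GB", order, " ")
            for (i = 1; i <= 6; i++) printf "%-12s %d\n", order[i], n[order[i]] + 0
        }'
}
```

Running wc once per file is slow on very large trees, but for a few thousand files like this data set it is fine.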


Results

This compares runtime and final archive size against serial gzip.  This was accomplished by passing the compressor to tar with -I, for example:

tar -I pigz -cf myarchive.tar.gz
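
For anyone wanting to repeat this kind of comparison, the measurement can be sketched roughly as below. This is not the exact script used for the numbers in this post; bench is a hypothetical helper name, it assumes GNU tar (for -I) and whole-second timing from date +%s, and it writes the archive to a temporary file that is deleted afterwards.

```shell
# bench <label> <compressor command> <directory>
# Prints: <label> <elapsed seconds> <archive size in bytes>
bench() {
    label="$1"; comp="$2"; src="$3"
    out=$(mktemp) || return 1
    start=$(date +%s)
    # GNU tar's -I (--use-compress-program) pipes the tar stream through $comp
    tar -I "$comp" -cf "$out" -C "$src" . || { rm -f "$out"; return 1; }
    end=$(date +%s)
    size=$(( $(wc -c < "$out") ))   # arithmetic expansion strips any padding from wc
    rm -f "$out"
    echo "$label $((end - start)) $size"
}
```

Calling it once with gzip as the baseline and once per parallel compressor (for example bench pigz pigz mydata, where mydata is a placeholder directory) gives the two numbers needed per row: the speedup is the gzip time divided by the compressor's time, and the size is read off directly.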

Command    Compatible  Parallel Compress  Parallel Decom.  Speed vs. Gzip  Size (gzip: 153G)
gzip       gzip        No                 No               1x              153G
pigz       gzip        Yes                No               32x             153G
lbzip2     bzip2       Yes                Yes              23x             151G
mpibzip2   bzip2       Yes                Yes              *               151G
xz -T0     xz/lzma     Yes                No               5.5x            137G
pixz       xz/lzma     Yes                Yes              5.5x            137G
zstd -T0   zst         Yes                Yes              67x             155G
lz4        lz4         No                 No               42.2x           171G
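
To read the sizes above as space saved, divide each archive size by the 221 GB original. A minimal sketch of that arithmetic follows, assuming the G values in the table use the same units as the 221 GB total (the post does not say which); savings is just a helper name made up for this example.

```shell
# savings <original size> <compressed size>
# Prints the percentage of space saved, to one decimal place.
savings() {
    LC_ALL=C awk -v orig="$1" -v comp="$2" \
        'BEGIN { printf "%.1f\n", (1 - comp / orig) * 100 }'
}

savings 221 153   # gzip / pigz
savings 221 137   # xz / pixz
savings 221 171   # lz4
```

Under that assumption the savings come out at roughly 30.8% for gzip and pigz, 38.0% for xz and pixz, and 22.6% for lz4.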

Notes

  • pigz compresses in parallel, but decompression sees only a very minimal speedup

  • xz requires the -T0 option to use all cores in the system; otherwise it defaults to 1

  • xz cannot decompress files in parallel, but pixz can

  • lbzip2 and mpibzip2 can only decompress in parallel if the archive was compressed with a parallel-aware compressor

  • lz4 is not parallel aware but is by far the fastest compressor of all, though with the least space savings

  • zstd requires the -T0 option to use all cores; otherwise it defaults to 1
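
The thread flags from the notes above can be wrapped up so they are not forgotten. This is only a sketch: compressor_cmd is a hypothetical helper, and it assumes that pigz, lbzip2 and pixz use all cores by default (which matches their documentation as far as I know), while xz and zstd need -T0.

```shell
# compressor_cmd <tool>
# Prints the command string to hand to tar -I so the tool uses all available cores.
compressor_cmd() {
    case "$1" in
        xz)   echo "xz -T0" ;;     # xz defaults to 1 thread without -T0
        zstd) echo "zstd -T0" ;;   # zstd defaults to 1 thread without -T0
        pigz|lbzip2|pixz|lz4) echo "$1" ;;   # no extra flag needed (lz4 is single threaded)
        *) echo "unknown compressor: $1" >&2; return 1 ;;
    esac
}
```

With GNU tar this would be used as tar -I "$(compressor_cmd xz)" -cf myarchive.tar.xz mydata, where mydata is a placeholder for the directory being archived. tar calls the same program with -d added when extracting.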

Conclusion

Overall, the drop-in replacements for gzip and bzip2 are obvious improvements on modern multi-core systems.  While xz and lz4 are available on almost all modern systems, they are still less portable than the gzip- and bzip2-based compressors.

lz4 is very interesting because it is so fast that it uses almost no CPU.  If one were collecting data on a lower-powered device, lz4 appears to be 'compression for free'.  While it is not as effective as the other compressors, there is almost no performance impact during tar/untar when using lz4.

One would hope that over time the stock installs of gzip and bzip2 are replaced by the parallel versions. xz is very stable and still returns the best compression ratio, but it struggles to utilize the very high core counts of modern systems.
