I added stream & multi-thread support for libdeflate #335
Comments
I now support compression_level>=10 and have added the test results:
|
@sisong How do I compile your |
libdeflate-pgzip streaming decompression beats igzip (from ISA-L) in decompression speed (the latter is/was the fastest gzip streaming decompressor that I know of). Quite impressive!
One thing that doesn't work is decompressing gzip files that contain multiple gzip headers (concatenated gzip streams): only the first gzip stream is decompressed.
$ echo "1" | gzip > multiple_gzip_header.gz
$ echo "2" | gzip >> multiple_gzip_header.gz
$ echo "3" | gzip >> multiple_gzip_header.gz
$ /software/libdeflate_streaming/build/programs/libdeflate-pgzip -cd multiple_gzip_header.gz
1
# Standard gzip does not have problems with those files.
$ zcat multiple_gzip_header.gz
1
2
3
# igzip of ISA-L works too.
$ /software/isa-l/programs/igzip -cd multiple_gzip_header.gz
1
2
3 |
@ghuls
I haven't used CMake; |
stream-mt is now updated to a libdeflate v1.20 base;
|
Hello, Can you add this zlib version too? https://dougallj.wordpress.com/2022/08/20/faster-zlib-deflate-decompression-on-the-apple-m1-and-x86/ |
@osevan I ran the performance tests for zlib-dougallj alongside the previous programs; the results of the other tests are almost unchanged.
|
Ok big thx |
And what about rapidgzip? |
@osevan |
@sisong Could you share exactly how you compile and build pgzip? |
@ghuls |
I have now committed my build setup, including VC, Xcode, and a Makefile.
You can also download the pgzip executable that I compiled directly: https://github.com/sisong/libdeflate/releases |
@sisong Thanks for adding the makefile. I managed to compile it successfully. As mentioned earlier, pgzip does not support decompressing concatenated gzip files (like the BGZF format, commonly used in bioinformatics). It would be great if it did, as streaming support would then make pgzip very useful in pipelines.
# Create file with multiple concatenated gzip archives.
$ echo "1" | gzip > multiple_gzip_header.gz
$ echo "2" | gzip >> multiple_gzip_header.gz
$ echo "3" | gzip >> multiple_gzip_header.gz
# gzip can decompress the full file.
$ zcat multiple_gzip_header.gz
1
2
3
# libdeflate gzip decompresses the full file.
$ ./build/programs/libdeflate-gzip -c -d multiple_gzip_header.gz
1
2
3
# pgzip only decompresses the first gzip archive.
$ ./pgzip/pgzip -c -d multiple_gzip_header.gz
1 |
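The behaviour requested above can be illustrated with a minimal sketch. It uses Python's standard `zlib` module rather than libdeflate or pgzip, so it says nothing about how pgzip is implemented; the function name `decompress_all_members` and the chunk size are assumptions made for illustration. The idea is that a streaming decompressor stops at the end of the first gzip member, and the caller has to restart it on the leftover input to reach the following members.

```python
import gzip
import io
import zlib


def decompress_all_members(stream, chunk_size=64 * 1024):
    """Yield decompressed bytes from every gzip member in a binary stream.

    Hypothetical helper for illustration; not part of pgzip or libdeflate.
    """
    decomp = zlib.decompressobj(wbits=31)  # wbits=31: expect a gzip wrapper
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        while chunk:
            out = decomp.decompress(chunk)
            if out:
                yield out
            if decomp.eof:
                # This member is complete. Any leftover input belongs to the
                # next member, so restart with a fresh decompressor.
                chunk = decomp.unused_data
                decomp = zlib.decompressobj(wbits=31)
            else:
                chunk = b""  # everything was consumed; read more input


if __name__ == "__main__":
    # Same layout as multiple_gzip_header.gz: three concatenated members.
    data = b"".join(gzip.compress(x) for x in (b"1\n", b"2\n", b"3\n"))
    print(b"".join(decompress_all_members(io.BytesIO(data))).decode(), end="")
```

A decompressor that never restarts on the leftover input returns only the first member, which matches the symptom shown in the transcript.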
@ghuls |
@sisong Thanks! It now works for the BGZF-compressed files I have tried so far. Some timings:
# BGZF compressed file (concatenated gzipped files).
$ file bgzipped.fastq.gz
bgzipped.fastq.gz: gzip compressed data, extra field, original size 10822
# Decompression time with standard gzip.
module load gzip/1.12
$ timeit gzip -cd bgzipped.fastq.gz | wc -l
Time output:
------------
* Command: gzip -cd bgzipped.fastq.gz
* Elapsed wall time: 1:15.27 = 75.27 seconds
* Elapsed CPU time:
- User: 72.81
- Sys: 2.29
* CPU usage: 99%
* Context switching:
- Voluntarily (e.g.: waiting for I/O operation): 9
- Involuntarily (time slice expired): 417
* Maximum resident set size (RSS: memory) (kiB): 1920
* Number of times the process was swapped out of main memory: 0
* Filesystem:
- # of inputs: 216
- # of outputs: 0
* Exit status: 0
384498832
# Decompression time with pigz with standard zlib.
module load pigz/2.7
$ timeit pigz -cd bgzipped.fastq.gz | wc -l
Time output:
------------
* Command: pigz -cd -cd bgzipped.fastq.gz
* Elapsed wall time: 0:44.13 = 44.13 seconds
* Elapsed CPU time:
- User: 39.82
- Sys: 14.25
* CPU usage: 122%
* Context switching:
- Voluntarily (e.g.: waiting for I/O operation): 1405471
- Involuntarily (time slice expired): 3186
* Maximum resident set size (RSS: memory) (kiB): 2384
* Number of times the process was swapped out of main memory: 0
* Filesystem:
- # of inputs: 304
- # of outputs: 0
* Exit status: 0
384498832
# Decompression time with pigz with zlib-ng.
module load pigz/2.7
module load zlib-ng/2.1.6
$ timeit pigz -cd bgzipped.fastq.gz | wc -l
Time output:
------------
* Command: pigz -cd bgzipped.fastq.gz
* Elapsed wall time: 0:30.60 = 30.60 seconds
* Elapsed CPU time:
- User: 24.79
- Sys: 14.18
* CPU usage: 127%
* Context switching:
- Voluntarily (e.g.: waiting for I/O operation): 1410511
- Involuntarily (time slice expired): 2367
* Maximum resident set size (RSS: memory) (kiB): 2452
* Number of times the process was swapped out of main memory: 0
* Filesystem:
- # of inputs: 0
- # of outputs: 0
* Exit status: 0
384498832
# Decompression time with igzip of ISA-L.
module load ISA-L/2.30.0
$ timeit igzip -cd bgzipped.fastq.gz | wc -l
Time output:
------------
* Command: igzip -cd bgzipped.fastq.gz
* Elapsed wall time: 0:16.45 = 16.45 seconds
* Elapsed CPU time:
- User: 14.01
- Sys: 2.39
* CPU usage: 99%
* Context switching:
- Voluntarily (e.g.: waiting for I/O operation): 25
- Involuntarily (time slice expired): 111
* Maximum resident set size (RSS: memory) (kiB): 3236
* Number of times the process was swapped out of main memory: 0
* Filesystem:
- # of inputs: 0
- # of outputs: 0
* Exit status: 0
384498832
# Decompression time with pgzip.
$ timeit pgzip/pgzip -cd bgzipped.fastq.gz | wc -l
Time output:
------------
* Command: pgzip/pgzip -cd bgzipped.fastq.gz
* Elapsed wall time: 0:12.59 = 12.59 seconds
* Elapsed CPU time:
- User: 16.48
- Sys: 18.51
* CPU usage: 277%
* Context switching:
- Voluntarily (e.g.: waiting for I/O operation): 572318
- Involuntarily (time slice expired): 137
* Maximum resident set size (RSS: memory) (kiB): 4252
* Number of times the process was swapped out of main memory: 0
* Filesystem:
- # of inputs: 0
- # of outputs: 0
* Exit status: 0
384498832
Decompressing with 2 threads seems to give the best performance (and lower CPU usage than the default 4):
$ timeit pgzip/pgzip -cd -p 1 bgzipped.fastq.gz | wc -l
Time output:
------------
* Command: pgzip/pgzip -cd -p 1 bgzipped.fastq.gz
* Elapsed wall time: 0:13.15 = 13.15 seconds
* Elapsed CPU time:
- User: 11.18
- Sys: 1.94
* CPU usage: 99%
* Context switching:
- Voluntarily (e.g.: waiting for I/O operation): 11
- Involuntarily (time slice expired): 63
* Maximum resident set size (RSS: memory) (kiB): 3092
* Number of times the process was swapped out of main memory: 0
* Filesystem:
- # of inputs: 0
- # of outputs: 0
* Exit status: 0
384498832
$ timeit pgzip/pgzip -cd -p 2 bgzipped.fastq.gz | wc -l
Time output:
------------
* Command: pgzip/pgzip -cd -p 2 bgzipped.fastq.gz
* Elapsed wall time: 0:12.14 = 12.14 seconds
* Elapsed CPU time:
- User: 13.21
- Sys: 8.40
* CPU usage: 178%
* Context switching:
- Voluntarily (e.g.: waiting for I/O operation): 565213
- Involuntarily (time slice expired): 63
* Maximum resident set size (RSS: memory) (kiB): 3820
* Number of times the process was swapped out of main memory: 0
* Filesystem:
- # of inputs: 0
- # of outputs: 0
* Exit status: 0
384498832
$ timeit pgzip/pgzip -cd -p 3 bgzipped.fastq.gz | wc -l
Time output:
------------
* Command: pgzip/pgzip -cd -p 3 bgzipped.fastq.gz
* Elapsed wall time: 0:12.58 = 12.58 seconds
* Elapsed CPU time:
- User: 16.30
- Sys: 18.63
* CPU usage: 277%
* Context switching:
- Voluntarily (e.g.: waiting for I/O operation): 573049
- Involuntarily (time slice expired): 87
* Maximum resident set size (RSS: memory) (kiB): 4288
* Number of times the process was swapped out of main memory: 0
* Filesystem:
- # of inputs: 0
- # of outputs: 0
* Exit status: 0
384498832
$ timeit pgzip/pgzip -cd -p 4 bgzipped.fastq.gz | wc -l
Time output:
------------
* Command: pgzip/pgzip -cd -p 4 bgzipped.fastq.gz
* Elapsed wall time: 0:12.57 = 12.57 seconds
* Elapsed CPU time:
- User: 16.35
- Sys: 18.62
* CPU usage: 278%
* Context switching:
- Voluntarily (e.g.: waiting for I/O operation): 566209
- Involuntarily (time slice expired): 80
* Maximum resident set size (RSS: memory) (kiB): 4232
* Number of times the process was swapped out of main memory: 0
* Filesystem:
- # of inputs: 0
- # of outputs: 0
* Exit status: 0
384498832 |
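To make the timings above easier to compare, here is a small sketch that derives the speedup over gzip and the total CPU seconds from the reported numbers. The (user, sys, wall) values are copied from the timeit output in this comment; the dictionary layout and the `summarize` function are assumptions made for illustration.

```python
# (user, sys, wall) in seconds, copied from the timeit output above.
RUNS = {
    "gzip": (72.81, 2.29, 75.27),
    "pigz (zlib)": (39.82, 14.25, 44.13),
    "pigz (zlib-ng)": (24.79, 14.18, 30.60),
    "igzip": (14.01, 2.39, 16.45),
    "pgzip (default)": (16.48, 18.51, 12.59),
    "pgzip -p 1": (11.18, 1.94, 13.15),
    "pgzip -p 2": (13.21, 8.40, 12.14),
    "pgzip -p 3": (16.30, 18.63, 12.58),
    "pgzip -p 4": (16.35, 18.62, 12.57),
}


def summarize(runs, baseline="gzip"):
    """Return {name: (wall, speedup_vs_baseline, cpu_seconds)}."""
    base_wall = runs[baseline][2]
    return {
        name: (wall, base_wall / wall, user + sys_)
        for name, (user, sys_, wall) in runs.items()
    }


if __name__ == "__main__":
    for name, (wall, speedup, cpu) in summarize(RUNS).items():
        print(f"{name:16s} wall={wall:6.2f}s  speedup={speedup:5.2f}x  cpu={cpu:6.2f}s")
```

On these numbers, `pgzip -p 2` has the lowest wall time and spends fewer CPU seconds than the default run, which is consistent with the observation about 2 threads.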
Thank you for sharing libdeflate, it's great!
My project needs to run on phones, so I added some APIs to libdeflate to support streaming compression and decompression (ref #19) and multi-threaded compression (ref #40),
while trying to keep it simple and fast.
With these modifications in stream_mt, I rewrote gzip.c as pgzip.c to test streaming and multi-threaded parallel compression.
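For readers unfamiliar with multi-threaded deflate, the following is a minimal sketch of one well-known approach (the one pigz uses): split the input into chunks, prime each chunk's compressor with the preceding 32 KiB as a dictionary, and concatenate the results into a single raw deflate stream. It uses Python's `zlib` module and is only an illustration; it is an assumption that stream_mt works along similar lines, and the names `parallel_deflate` and `_compress_chunk` are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

WINDOW = 32 * 1024  # deflate history window size


def _compress_chunk(data, start, end, is_last, level):
    # Prime the compressor with the bytes preceding this chunk so matches
    # can reach back across the chunk boundary.
    zdict = data[max(0, start - WINDOW):start]
    kwargs = {"zdict": zdict} if zdict else {}
    comp = zlib.compressobj(level, zlib.DEFLATED, -15, **kwargs)  # -15: raw deflate
    body = comp.compress(data[start:end])
    # Z_SYNC_FLUSH ends on a byte boundary without a final block, so the
    # next chunk's output can simply be appended. Only the last chunk
    # finishes the stream.
    return body + comp.flush(zlib.Z_FINISH if is_last else zlib.Z_SYNC_FLUSH)


def parallel_deflate(data, chunk_size=128 * 1024, threads=4, level=6):
    """Compress bytes into one raw deflate stream using several threads."""
    bounds = [(s, min(s + chunk_size, len(data)))
              for s in range(0, len(data), chunk_size)] or [(0, 0)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        parts = pool.map(
            lambda b: _compress_chunk(data, b[0], b[1], b[1] == len(data), level),
            bounds)
        return b"".join(parts)


if __name__ == "__main__":
    payload = b"libdeflate stream_mt " * 50000
    packed = parallel_deflate(payload)
    assert zlib.decompress(packed, wbits=-15) == payload
    print(len(payload), "->", len(packed))
```

The output is an ordinary deflate stream, so any single-threaded decompressor can read it. The dictionary priming matters for ratio: without it, each chunk starts with an empty history window.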
It runs fine when compression_level<=9, but gives a bad compression ratio when compression_level>=10, because I don't know how to rebuild the hash dictionary for bt_matchfinder. I added a function bt_matchfinder_skip_bytes(), but it simply calls bt_matchfinder_skip_byte() in a loop, so it fails.
I need some help: how should bt_matchfinder_skip_bytes() be implemented? It should be similar to ht_matchfinder_skip_bytes() or hc_matchfinder_skip_bytes().
(Update: all compression levels are now supported.)
Current progress, with some files used for compression testing:
Test PC: Windows 11, CPU R9-7945HX, SSD PCIe 4.0 x4 4 TB, DDR5 5200 MHz 32 GB x2
Program versions: zlib v1.2.13, gzip in libdeflate v1.19, pgzip in stream_mt based on libdeflate v1.19
Only deflate compression and decompression are tested, without CRC; built with VC2022. The measured time includes the time to read and write the file data.
-p-16 means the compressor runs with 16 threads. Note: C ratio = sum(gzfile)/sum(srcfile)
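The C ratio defined in the note can be computed as in the short sketch below. It assumes a list of (source file, compressed file) path pairs; the function name `c_ratio` is hypothetical and this is not the script used to produce the results.

```python
import os


def c_ratio(pairs):
    """Return sum(size of gz files) / sum(size of source files).

    pairs: iterable of (src_path, gz_path) tuples.
    """
    total_src = 0
    total_gz = 0
    for src_path, gz_path in pairs:
        total_src += os.path.getsize(src_path)
        total_gz += os.path.getsize(gz_path)
    return total_gz / total_src
```

Because the sizes are summed before dividing, large files weigh more than small ones; this is not the same as averaging the per-file ratios.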