Zstd: ZSTD_c_targetCBlockSize == ZSTD_TARGETCBLOCKSIZE_MAX leads to improbably bad compression

Created on 28 Apr 2020  ·  3 Comments  ·  Source: facebook/zstd

Using ZSTD_CCtxParams_setParameter(cctxParams, ZSTD_c_targetCBlockSize, ZSTD_TARGETCBLOCKSIZE_MAX) leads to improbably bad compression for sources with large minimum sequences (>1kB).

Using a repeated concatenation of a minified CSS file as input:

With targetCBlockSize == ZSTD_TARGETCBLOCKSIZE_MAX:
bootstrap.min.css : 97.31% (13862462 => 13488900 bytes, bootstrap.min.css.zst)

Without setting targetCBlockSize:
bootstrap.min.css : 0.15% (13862462 => 20476 bytes, bootstrap.min.css.zst)
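
To make the size of the regression concrete, here is a small sketch that recomputes the two percentages from the byte counts quoted above. The function names `ratio_percent` and `print_summary` are made up for this illustration and are not part of zstd; the print format only approximates the CLI summary line.

```c
#include <stdio.h>

/* Compressed size as a percentage of the original size (smaller is better).
 * Hypothetical helper for illustration, not a zstd function. */
double ratio_percent(unsigned long long src_size, unsigned long long dst_size)
{
    if (src_size == 0) {
        return 0.0; /* avoid dividing by zero on empty input */
    }
    return 100.0 * (double)dst_size / (double)src_size;
}

/* Prints one line in the spirit of the summary lines quoted in the report. */
void print_summary(const char *name,
                   unsigned long long src_size,
                   unsigned long long dst_size)
{
    printf("%s : %.2f%% (%llu => %llu bytes)\n",
           name, ratio_percent(src_size, dst_size), src_size, dst_size);
}
```

Calling `print_summary("bootstrap.min.css", 13862462ULL, 13488900ULL)` and then the same with `20476ULL` gives 97.31% and 0.15% respectively, so the output with the parameter set is more than 600 times larger than the output without it.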

Further note: I encountered this while trying to reduce the block size on decompression, regarding #2093. If, instead of setting targetCBlockSize, ZSTD_BLOCKSIZE_MAX itself is reduced (which in turn reduces the maximum block size as well), the impact on compression is much smaller.


The source used was obtained via the following sequence:
rm bootstrap.min.css; wget https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css && for i in {1..5}; do cat bootstrap.min.css >> bootstrap_2.min.css; cat bootstrap_2.min.css >> bootstrap.min.css; done && rm bootstrap_2.min.css
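
For readers who want to check that they built the same input, the following sketch models how the loop above grows the file. It assumes bootstrap_2.min.css does not exist when the sequence starts; the function name `repro_size` is hypothetical. Each round appends the main file to the scratch file and then the scratch file back to the main file, so the copy count follows every other Fibonacci number (2, 5, 13, 34, 89).

```c
#include <stdint.h>

/* Size of bootstrap.min.css after `iterations` rounds of the shell loop,
 * given the size of the freshly downloaded file.
 * Assumption: bootstrap_2.min.css starts out absent (size 0). */
uint64_t repro_size(uint64_t base_size, unsigned iterations)
{
    uint64_t main_file = base_size; /* bootstrap.min.css */
    uint64_t scratch = 0;           /* bootstrap_2.min.css */

    for (unsigned i = 0; i < iterations; i++) {
        scratch += main_file;   /* cat bootstrap.min.css   >> bootstrap_2.min.css */
        main_file += scratch;   /* cat bootstrap_2.min.css >> bootstrap.min.css   */
    }
    return main_file;
}
```

After five rounds the file holds 89 copies of the download. The reported input size of 13862462 bytes is exactly 89 × 155758, which suggests the downloaded file was 155758 bytes; that base size is inferred from the numbers in this report, not stated in it.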

bug release-blocking

All 3 comments

I'm going to ask a question here regarding architecture: does targetCBlockSize cause the compressor to assume that its output will be fed to a decompressor that has no knowledge outside the block it receives, i.e. a fully bufferless streaming decompressor?

targetCBlockSize is "meant" to be used in streaming scenarios where you want to reduce the time to decompress the first byte. So if your packet size is 4KB you could set your target block size to 4KB and attempt to make each packet decompressible, instead of having to wait for a full 128KB before decompressing the first byte.

I'm not sure what's going on in this scenario, but with the input it should be easy to reproduce and fix. I will look into it soon. Thanks again for the report and the detailed repro instructions!
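
As a rough illustration of the packet example in the comment above, this sketch counts how many fixed-size packets must arrive before one compressed block is complete. It is a simplification: `packets_per_block` is a hypothetical helper, frame and block headers are ignored, and the block is assumed to be exactly `block_bytes` long once compressed, whereas a real compressed block is usually smaller than its 128KB upper bound and targetCBlockSize is a target rather than a guarantee.

```c
#include <stddef.h>

/* Number of packets of `packet_bytes` needed to carry one compressed block
 * of `block_bytes` (ceiling division). Hypothetical helper, not a zstd API. */
size_t packets_per_block(size_t block_bytes, size_t packet_bytes)
{
    if (packet_bytes == 0) {
        return 0; /* no meaningful answer without a packet size */
    }
    return (block_bytes + packet_bytes - 1) / packet_bytes;
}
```

With 4KB packets, a block at the 128KB bound spans `packets_per_block(128 * 1024, 4096)` = 32 packets in the worst case, while a block that meets a 4KB target fits in one, which is the latency difference the parameter is aimed at.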

@dciliske I've reproduced the issue at level 22. It is not present in zstd-1.4.4, so it never made it into a release. It looks like I introduced it in https://github.com/facebook/zstd/pull/1947. I'll fix it soon.
