Zstd multithreaded output can depend on number of threads

Created on 25 Sep 2020  ·  3 Comments  ·  Source: facebook/zstd

Describe the bug
As reported by @animalize in Issue #2238:

When using the ZSTD_e_end directive with an output buffer of size >= ZSTD_compressBound(), the number of jobs is computed by the ZSTDMT_computeNbJobs() function. This function produces a different number of jobs depending on nbWorkers:

https://github.com/facebook/zstd/blob/b706286adbba780006a47ef92df0ad7a785666b6/lib/compress/zstdmt_compress.c#L1243-L1255
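
For reference, the logic at that permalink is approximately the following (a paraphrase of lib/compress/zstdmt_compress.c, not a verbatim copy; details may differ between versions). Note that both branches of the final ternary depend on nbWorkers:

```c
/* Paraphrase of ZSTDMT_computeNbJobs() from the permalink above;
 * internal helpers like ZSTDMT_computeTargetJobLog() and MIN() are
 * from the same file. */
static unsigned ZSTDMT_computeNbJobs(const ZSTD_CCtx_params* params,
                                     size_t srcSize, unsigned nbWorkers)
{
    size_t const jobSizeTarget = (size_t)1 << ZSTDMT_computeTargetJobLog(params);
    size_t const jobMaxSize    = jobSizeTarget << 2;
    size_t const passSizeMax   = jobMaxSize * nbWorkers;     /* scales with nbWorkers */
    unsigned const multiplier  = (unsigned)(srcSize / passSizeMax) + 1;
    unsigned const nbJobsLarge = multiplier * nbWorkers;     /* scales with nbWorkers */
    unsigned const nbJobsMax   = (unsigned)(srcSize / jobSizeTarget) + 1;
    unsigned const nbJobsSmall = MIN(nbJobsMax, nbWorkers);  /* capped by nbWorkers */
    return (multiplier > 1) ? nbJobsLarge : nbJobsSmall;
}
```

Since the job count determines how the input is partitioned, and each job compresses its slice independently, a different nbWorkers can yield a byte-for-byte different frame.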

Expected behavior
The output of zstd multithreaded compression must be independent of the number of threads.

Fix

  • [ ] Make ZSTDMT_computeNbJobs() independent of nbWorkers.
  • [ ] Add a fuzz test that checks that the output of multithreaded zstd is always independent of the number of threads (a sketch of the shape of such a check follows this list).
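
A minimal sketch of what such a check could look like (not the actual fuzzer): it compresses the same buffer at two worker counts, using a first-call ZSTD_e_end with an output buffer >= ZSTD_compressBound() (exactly the code path described above), then compares the outputs byte for byte. Whether the assertion actually fires depends on input size, compression level, and job size:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <zstd.h>

/* Compress `src` in a single first-call ZSTD_e_end invocation with the
 * given worker count (requires a multithreaded build of libzstd).
 * dstCap >= ZSTD_compressBound(srcSize), so the call completes in one
 * shot and returns 0. */
static size_t compressWithWorkers(void* dst, size_t dstCap,
                                  const void* src, size_t srcSize,
                                  int nbWorkers)
{
    ZSTD_CCtx* const cctx = ZSTD_createCCtx();
    ZSTD_outBuffer out = { dst, dstCap, 0 };
    ZSTD_inBuffer  in  = { src, srcSize, 0 };
    ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, nbWorkers);
    size_t const ret = ZSTD_compressStream2(cctx, &out, &in, ZSTD_e_end);
    assert(ret == 0);   /* fully flushed, no error */
    ZSTD_freeCCtx(cctx);
    return out.pos;
}

int main(void)
{
    static char src[4 * 1024 * 1024];   /* arbitrary test input */
    memset(src, 'a', sizeof src);
    size_t const bound = ZSTD_compressBound(sizeof src);
    char* const dstA = malloc(bound);
    char* const dstB = malloc(bound);
    size_t const szA = compressWithWorkers(dstA, bound, src, sizeof src, 2);
    size_t const szB = compressWithWorkers(dstB, bound, src, sizeof src, 3);
    /* The invariant under test: same input => same output, regardless
     * of thread count. With the bug present, this can fail. */
    assert(szA == szB && memcmp(dstA, dstB, szA) == 0);
    free(dstA); free(dstB);
    return 0;
}
```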

Workaround
If you need to work around this bug, don't start your streaming job with ZSTD_e_end. Pass at least one byte of input with ZSTD_e_continue before calling ZSTD_e_end, or ensure your output buffer is < ZSTD_compressBound(inputSize).
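
A sketch of the first option, assuming the output buffer is still >= ZSTD_compressBound() and srcSize is at least 1 (error handling kept minimal):

```c
#include <zstd.h>

/* Workaround sketch: feed one byte with ZSTD_e_continue first, so the
 * context leaves its init stage and the one-shot shortcut is never
 * taken; then finish normally with ZSTD_e_end. */
static size_t compressAvoidingShortcut(ZSTD_CCtx* cctx,
                                       void* dst, size_t dstCap,
                                       const void* src, size_t srcSize)
{
    ZSTD_outBuffer out = { dst, dstCap, 0 };
    ZSTD_inBuffer  in  = { src, 1, 0 };   /* at least one byte first */
    size_t ret = ZSTD_compressStream2(cctx, &out, &in, ZSTD_e_continue);
    if (ZSTD_isError(ret)) return ret;
    in.size = srcSize;                    /* then the rest of the input */
    do {
        ret = ZSTD_compressStream2(cctx, &out, &in, ZSTD_e_end);
        if (ZSTD_isError(ret)) return ret;
    } while (ret != 0);   /* 0 => frame fully flushed */
    return out.pos;
}
```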

Label: bug


All 3 comments

It's a shortcut to say that the outcome of multithreaded zstd does not depend on the number of threads.

Actually, the supported property is that the outcome of _streaming_ multithreaded zstd does not depend on the number of threads
(and that's what the zstd CLI uses).

This definition makes it possible to consider another potential fix:
do not employ the one-pass shortcut for ZSTD_e_end when nbWorkers >= 1,
since it's the delegation to the one-pass mode that triggers this issue.
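
Purely to illustrate that alternative (a hypothetical sketch, not the actual zstd source or any real patch), the gate for the one-pass shortcut would gain an nbWorkers condition along these lines:

```c
#include <zstd.h>

/* Hypothetical illustration only, not real zstd code: the ZSTD_e_end
 * single-pass shortcut would additionally require nbWorkers == 0, so
 * multithreaded streaming always stays on the streaming path, whose
 * output does not depend on the thread count. */
static int mayUseSinglePassShortcut(ZSTD_EndDirective endOp,
                                    size_t dstCapacity, size_t srcSize,
                                    int nbWorkers)
{
    return (endOp == ZSTD_e_end)
        && (dstCapacity >= ZSTD_compressBound(srcSize))
        && (nbWorkers == 0);   /* the proposed extra condition */
}
```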

This could be less disruptive than trying to adapt the single-pass MT compressor,
which was never designed to offer this guarantee.

Another (potentially positive) side effect is that it would guarantee that streaming multithreaded compression is _always_ non-blocking, since it would no longer delegate to the (blocking) single-pass mode.
_edit_: scrap that, no longer delegating to the single-pass mode doesn't guarantee non-blocking behavior, since on receiving the ZSTD_e_flush and ZSTD_e_end directives, the MT API contract changes from minimal forward progress to maximal progress.

> Another (potentially positive) side effect is that it would guarantee that streaming multithreaded compression is always non-blocking, since it would no longer delegate to the blocking mode.

I once wanted to propose adding a ZSTD_compressStream3() function that is always blocking in multithreaded compression.

Having the caller repeatedly check for progress in non-blocking mode is very inconvenient.

edit: It turns out that checking the progress is not that inconvenient:

```c
/* Drive ZSTD_e_continue until the input is consumed, the output
 * buffer fills up, or an error occurs. */
do {
    zstd_ret = ZSTD_compressStream2(self->cctx, &out, &in, ZSTD_e_continue);
} while (out.pos != out.size && in.pos != in.size && !ZSTD_isError(zstd_ret));
```

But it would be better to have an always-blocking ZSTD_compressStream3(); it might be a bit faster, and IMO many users don't need to track compression progress.

> This could be less disruptive than trying to adapt the single-pass MT compressor,
> which was never designed to offer this guarantee.

Yeah, that is probably easier. I had forgotten that all the jobs in the single-pass MT compressor needed to be launched at once.

> I once wanted to propose adding a ZSTD_compressStream3() function that is always blocking in multithreaded compression.

Generally, given the way people write streaming compression loops, it shouldn't be terribly inconvenient not to make maximal forward progress. If we were to add something like this, it wouldn't require a new API; we'd probably just need to add a compression parameter to control it. But I don't currently see a great need for it.
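
For context, the canonical finish loop (in the style of zstd's streaming examples) already retries until the frame is done, so it tolerates calls that make less than maximal progress. A sketch, where the buffer size and file I/O are assumptions:

```c
#include <stdio.h>
#include <zstd.h>

/* Finish a streaming frame: keep calling ZSTD_compressStream2() with
 * ZSTD_e_end, draining the output buffer between calls, until it
 * returns 0 (frame fully flushed). A loop written this way doesn't
 * care whether any single call made maximal or minimal progress. */
static size_t finishFrame(ZSTD_CCtx* cctx, ZSTD_inBuffer* in, FILE* fout)
{
    char buf[64 * 1024];   /* any reasonable chunk size */
    size_t remaining;
    do {
        ZSTD_outBuffer out = { buf, sizeof buf, 0 };
        remaining = ZSTD_compressStream2(cctx, &out, in, ZSTD_e_end);
        if (ZSTD_isError(remaining)) return remaining;
        fwrite(buf, 1, out.pos, fout);
    } while (remaining != 0);
    return 0;
}
```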

