Thread-parallel compression filters? - Feature request

There are now a few applications that have implemented thread-parallel compression and decompression:


One trick is to use H5Dread_chunk or H5Dwrite_chunk. These read or write a chunk directly in its compressed form, bypassing the HDF5 filter pipeline. You can then set up the thread-parallel compression or decompression yourself.
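To make the idea concrete, here is a minimal Python sketch of the read side. It is not tied to any particular binding: `read_chunk` is a hypothetical callable that returns the raw, still-compressed bytes of one chunk. With h5py I believe `lambda off: dset.id.read_direct_chunk(off)[1]` would fit, since that call wraps H5Dread_chunk. The sketch also assumes the only filter is deflate, so `zlib.decompress` is the right inverse; a dictionary stands in for the dataset here.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def decompress_chunks_parallel(read_chunk, chunk_offsets, max_workers=4):
    """Fetch raw chunks one at a time, then inflate them in worker threads.

    read_chunk is a hypothetical callable: chunk offset -> compressed bytes.
    """
    # The reads stay serial because the HDF5 library itself is not
    # thread-parallel; only the decompression is spread over threads.
    raw = [read_chunk(off) for off in chunk_offsets]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # zlib releases the GIL while inflating, so the threads can overlap.
        decoded = list(pool.map(zlib.decompress, raw))
    return dict(zip(chunk_offsets, decoded))


# Stand-in for a chunked dataset: chunk offset -> deflate-compressed bytes.
fake_store = {
    (0, 0): zlib.compress(b"a" * 1000),
    (0, 100): zlib.compress(b"b" * 1000),
}
result = decompress_chunks_parallel(fake_store.__getitem__, list(fake_store))
```

The write side is symmetric: compress the chunks in threads, then pass each result to H5Dwrite_chunk serially.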

Another approach is to use H5Dget_chunk_info to query the location (address and size) of a chunk within the file, so that you can read the compressed bytes yourself. H5Dchunk_iter provides a faster way to do this, particularly if you want this information for all the chunks, but it is a relatively new API function.
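Once you have an (address, size) pair per chunk, the reads themselves can also go through ordinary file I/O in parallel. The sketch below assumes a hypothetical `chunk_table` of `(byte_offset, size)` tuples, such as you might collect from H5Dget_chunk_info or H5Dchunk_iter, and again assumes plain deflate-compressed chunks. A temporary file with a fake 512-byte header stands in for a real HDF5 file. I am also assuming the reported address can be used as an absolute file offset, which should hold when the file has no user block.

```python
import os
import tempfile
import zlib
from concurrent.futures import ThreadPoolExecutor


def read_and_inflate(path, byte_offset, size):
    """Read one compressed chunk straight from the file and inflate it."""
    # Each call opens its own handle, so threads never share a file position.
    with open(path, "rb") as f:
        f.seek(byte_offset)
        return zlib.decompress(f.read(size))


def load_chunks(path, chunk_table, max_workers=4):
    """chunk_table: hypothetical list of (byte_offset, size) per chunk."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(read_and_inflate, path, off, size)
            for off, size in chunk_table
        ]
        # Collect in submission order so results line up with chunk_table.
        return [f.result() for f in futures]


# Build a stand-in file: some leading bytes, then compressed chunks.
payloads = [bytes([i]) * 4096 for i in range(4)]
chunk_table = []
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 512)  # pretend file metadata
    for payload in payloads:
        blob = zlib.compress(payload)
        chunk_table.append((tmp.tell(), len(blob)))
        tmp.write(blob)
    demo_path = tmp.name

chunks = load_chunks(demo_path, chunk_table)
os.remove(demo_path)
```

This bypasses the HDF5 library entirely for the data reads, so it only makes sense for read-only access to a file that nothing else is modifying.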

The source for many of the filters is located in the following repository.

For example, the code for the Zstd filter is here:

From the source code there, you can see it simply uses ZSTD_decompress or ZSTD_compress.

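It would be pretty easy to swap ZSTD_compress for ZSTD_compress2 with a ZSTD_CCtx on which the parameter ZSTD_c_nbWorkers has been set via ZSTD_CCtx_setParameter, so that multiple threads work on each chunk. Note that ZSTD_compressCCtx ignores advanced parameters such as ZSTD_c_nbWorkers, and that this parameter only affects compression; as far as I know, ZSTD_decompressDCtx still decompresses a frame on a single thread. However, I suspect that having multiple threads each deal with individual chunks may be more efficient. This depends on your chunking scheme.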
http://facebook.github.io/zstd/zstd_manual.html#Chapter4
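The "one thread per chunk" alternative can be sketched in a few lines of Python. This uses zlib from the standard library as a stand-in for Zstd, and `split_into_chunks` and `compress_chunks_parallel` are illustrative names, not part of any library. Each chunk is compressed independently by a single-threaded compressor, and the parallelism comes from handling several chunks at once.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, chunk_size):
    """Cut a flat buffer into fixed-size chunks (the last may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def compress_chunks_parallel(chunks, level=6, max_workers=4):
    """Compress each chunk on its own worker thread, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so index i of the
        # output is always the compressed form of chunks[i].
        return list(pool.map(lambda c: zlib.compress(c, level), chunks))


data = bytes(range(256)) * 64            # 16 KiB of sample data
chunks_in = split_into_chunks(data, 4096)
compressed = compress_chunks_parallel(chunks_in)
restored = b"".join(zlib.decompress(c) for c in compressed)
```

With many small chunks this keeps every thread busy without any coordination inside the compressor. With a few very large chunks, multithreading inside the compressor may pay off more.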