Zstandard compression plug-in

Hello HDF5 group!

Zstandard is a real-time compression algorithm providing high compression ratios. It offers a very wide range of compression/speed trade-offs while being backed by a very fast decoder. The Zstandard library is provided as open-source software under a BSD license.
www.zstd.net

Attached you can find an implementation of a Zstd HDF5 filter plug-in. My tests confirm the good properties of Zstd compression, even on small chunks.

I'd like the filter binary format to be registered in the HDF5 filter registry.

Is anything else needed to get the filter ID registered? I think the filter code is trivial, but if an explicit license is needed, please let me know.

Best wishes,
Andrey Paramonov

zstd_h5plugin.c (1.66 KB)

zstd_h5plugin.h (456 Bytes)



Hi Andrey,

The zstd plugin has an issue: it does not check the validity of return values. If ZSTD_compress returns an error, that value is passed back to HDF5 as a size, which HDF5 then interprets as an insanely large chunk size, leading to weird error messages. Here is a corrected version:

        compSize = ZSTD_compress(outbuf, compSize, inbuf, origSize, aggression);

        if (ZSTD_isError(compSize))
        {
                printf("ZSTD-Plugin: (compress %lld bytes) ZSTD ERROR %s!\n",
                       (long long)origSize, ZSTD_getErrorName(compSize));
                fflush(stdout);
                if (outbuf)
                        free(outbuf);

                /* returning 0 signals filter failure to HDF5 */
                return 0;
        }

Actually, I ran into problems with the zstd library's FSE tablelog, which seems not to have enough memory. Compiling the zstd library with settings such as

-DFSE_MAX_MEMORY_USAGE=18 -DFSE_TABLELOG_ABSOLUTE_MAX=16

cures that problem. Any idea what the issue may be? It seems weird that the default settings of the zstd library are unable to compress certain datasets.
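For reference, one way to pass those defines when building the static library (this assumes zstd's stock Makefile, which honors CPPFLAGS, and the libzstd.a target under lib/; details may differ between zstd releases):

```shell
# Build libzstd with a larger FSE table budget (flag values from the
# message above). Assumes the standard zstd source tree layout.
make -C lib CPPFLAGS="-DFSE_MAX_MEMORY_USAGE=18 -DFSE_TABLELOG_ABSOLUTE_MAX=16" libzstd.a
```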

         Werner