ZFP compression supported?

Hello,

In the thread “VOL-REST: memory-side hyperslab unsupported?”, it was mentioned that the selected filters are applied inside HSDS. Which filters are currently supported? Is the H5Z-ZFP filter, which is based on the ZFP library (LLNL/zfp: Compressed numerical arrays that support high-speed random access, github.com), supported? If not, would it be possible to add this filter? The ZFP library has a Python interface.

There is no rush for it. It would be a nice addition to HSDS.

Best regards,
Jan-Willem

Hey,

You can see which compressors are supported by HSDS by sending a GET request to a domain:

E.g.: curl http://localhost:5101/?domain=/home/test_user1/tall.h5

This will return a JSON dictionary with the key: “compressors” and value: [“blosclz”, “lz4”, “lz4hc”, “gzip”, “zstd”, “deflate”].
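The same check can be scripted. The following is a minimal sketch using only the Python standard library; the helper names (`build_domain_url`, `fetch_domain_info`, `is_supported`) are made up for illustration and are not part of HSDS or h5pyd. It assumes the endpoint answers unauthenticated requests as in the curl example above; a real deployment may require credentials. The sample reply reuses the compressor list quoted in this thread, so the check at the bottom runs without a server.

```python
import json
import urllib.parse
import urllib.request


def build_domain_url(endpoint, domain):
    """Build the same URL as the curl example: <endpoint>/?domain=<path>."""
    query = urllib.parse.urlencode({"domain": domain}, safe="/")
    return endpoint.rstrip("/") + "/?" + query


def fetch_domain_info(endpoint, domain):
    """GET the domain's JSON description (assumes no authentication is needed)."""
    with urllib.request.urlopen(build_domain_url(endpoint, domain)) as response:
        return json.load(response)


def is_supported(domain_info, name):
    """Return True if `name` appears under the "compressors" key of the reply."""
    compressors = domain_info.get("compressors", [])
    return name.lower() in (c.lower() for c in compressors)


# Sample reply containing only the key discussed in this thread.
sample_reply = {
    "compressors": ["blosclz", "lz4", "lz4hc", "gzip", "zstd", "deflate"]
}

print(is_supported(sample_reply, "zstd"))
print(is_supported(sample_reply, "zfp"))
```

Against a live server, `fetch_domain_info("http://localhost:5101", "/home/test_user1/tall.h5")` would take the place of `sample_reply`.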

So no ZFP support it seems. :frowning:

Most of the filters supported by HSDS come from the numcodecs package. So having ZFP supported in numcodecs would make enabling ZFP in HSDS a piece of cake.

There’s a long discussion about this in the numcodecs GitHub repository: https://github.com/zarr-developers/numcodecs/issues/117. I haven’t read through the entire issue, but it seems that ZFP uses an approach that makes it harder to implement with numcodecs. Maybe you have some thoughts here.

Another angle would be just directly using the ZFP package in HSDS. All the compression code is in hsds/util/storUtil.py, so it should be straightforward to add ZFP. If you’d like to make a pull request with the changes, I’ll be happy to review and merge the code.
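For anyone considering that route, here is a rough sketch of what a ZFP round trip looks like with the `zfpy` Python bindings. It is not HSDS code and does not show how storUtil.py would be wired up; the function name `zfp_roundtrip` is hypothetical, and it assumes `numpy` and `zfpy` are installed. It uses ZFP's fixed-accuracy mode, where `tolerance` is an absolute error bound, so the compression is lossy.

```python
def zfp_roundtrip(values, tolerance):
    """Compress a NumPy array with ZFP, decompress it, and report the cost.

    Returns (compressed_size_in_bytes, max_absolute_error).
    """
    # Imported here because both packages are optional third-party dependencies.
    import numpy as np
    import zfpy

    # compress_numpy writes a header by default, so decompress_numpy can
    # recover the shape and dtype from the byte string alone.
    compressed = zfpy.compress_numpy(values, tolerance=tolerance)
    restored = zfpy.decompress_numpy(compressed)

    max_error = float(np.max(np.abs(values - restored)))
    return len(compressed), max_error
```

A call such as `zfp_roundtrip(np.linspace(0.0, 1.0, 4096).reshape(64, 64), tolerance=1e-4)` gives a quick feel for the size/accuracy trade-off on smooth data before committing to an integration.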

Hi. We added support for the ZFP codec in Blosc2 a while ago (see “Announcing Support for Lossy ZFP Codec as a Plugin for C-Blosc2” on the Blosc main blog). So, one way to use it would be via Blosc2. We are trying to provide support for Blosc2 in numcodecs (“Preliminary version of Blosc2 module” by FrancescAlted, Pull Request #463, zarr-developers/numcodecs on GitHub), but it is not there yet.

Blosc2 is also supported in the hdf5plugin (Contribute — hdf5plugin documentation), but it is still kind of experimental, so you cannot use ZFP from there right out of the box. But we will be updating that soon.

The advantage of using ZFP from Blosc2 is that the latest version adds multidimensional layers for storing its internal partitions, not only at a logical level, but also at a physical one (see “Introducing Blosc2 NDim” on the Blosc main blog). This is great for allowing ZFP to compress more efficiently because it is meant to leverage duplication in multidimensional datasets.

Cheers,
Francesc Alted

Hello @jreadey and @faltet,

Thanks for all the information. It sounds like we have some options for supporting ZFP.

Best regards,
Jan-Willem