New compression plugin based on Snappy-CUDA

lucasvr · August 14, 2021, 4:08am

Hi, folks!

As part of a validation study related to computational storage with HDF5, I ended up writing an I/O filter for data (de)compression using GPUs. The I/O filter is based on Mohammad Dashti’s wonderful snappy-CUDA project.

The filter works as intended, so perhaps it can be useful to you, too. Please visit the project page at https://github.com/lucasvr/snappy-cuda if you’re interested in giving it a try.

Have fun!
Lucas

gheber · August 17, 2021, 3:30pm

Will you register a filter ID with The HDF Group? G.

byrn · August 17, 2021, 3:39pm

I'm not sure; Barbara used to handle that.

Allen

lucasvr · August 17, 2021, 5:44pm

Now that you ask, I realize that I don’t have a filter ID registered for HDF5-UDF either. If possible, I’d like to register both. What’s the preferred way to do so?

The filter ID I’ve been using for HDF5-UDF is 31300.
The ID I’ve temporarily assigned to Snappy-CUDA is 31301.

Thanks,
Lucas
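
For readers wondering where IDs such as 31300 and 31301 sit, here is a minimal sketch that classifies a filter ID by range. The boundaries are an assumption based on the ranges published on The HDF Group's registered-filters page as I recall them (0-255 library, 256-511 testing, 512-32767 assigned on request, 32768-65535 private use); `classify_filter_id` is a hypothetical helper, not part of HDF5 or h5py.

```python
def classify_filter_id(filter_id: int) -> str:
    """Return the (assumed) HDF5 filter ID range that filter_id falls into."""
    if not 0 <= filter_id <= 65535:
        raise ValueError(f"filter ID out of range: {filter_id}")
    if filter_id < 256:
        return "reserved for the HDF5 library"
    if filter_id < 512:
        return "testing during development"
    if filter_id < 32768:
        return "assigned by The HDF Group on request"
    return "private or experimental use"


# The two self-assigned IDs mentioned in this thread.
for fid in (31300, 31301):
    print(fid, "->", classify_filter_id(fid))
```

Under these assumed ranges, both IDs fall in the block that The HDF Group hands out, which is why self-assigned values there risk colliding with a future registration.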

epourmal · August 17, 2021, 6:30pm

If the filter is compatible with the existing Snappy filter (ID 32003; see https://portal.hdfgroup.org/display/support/Registered+Filter+Plugins), it can use that ID.

If not, The HDF Group will issue a new ID, 32023.

By compatibility I mean that the current Snappy filter can be used to decompress Snappy-CUDA compressed data and vice versa.
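
The two-way check described above can be sketched as follows. This is only an illustration under stated assumptions: zlib from the Python standard library stands in for Snappy so the code runs without extra dependencies, and every function name (`plain_encode`, `framed_encode`, `cross_compatible`, and so on) is hypothetical. The "framed" variant prepends an 8-byte length header to show that two filters built on the same algorithm can still fail the check if one adds its own framing.

```python
import struct
import zlib


def plain_encode(data: bytes) -> bytes:
    # Stand-in for a filter that stores the raw compressed stream.
    return zlib.compress(data)


def plain_decode(blob: bytes) -> bytes:
    return zlib.decompress(blob)


def framed_encode(data: bytes) -> bytes:
    # Hypothetical variant: 8-byte big-endian original size, then the stream.
    return struct.pack(">Q", len(data)) + zlib.compress(data)


def framed_decode(blob: bytes) -> bytes:
    (size,) = struct.unpack(">Q", blob[:8])
    out = zlib.decompress(blob[8:])
    if len(out) != size:
        raise ValueError("decoded size does not match header")
    return out


def cross_compatible(enc_a, dec_a, enc_b, dec_b, sample: bytes) -> bool:
    """True only if each decoder can read the other encoder's output."""
    try:
        return dec_b(enc_a(sample)) == sample and dec_a(enc_b(sample)) == sample
    except (zlib.error, ValueError, struct.error):
        return False


sample = bytes(range(256)) * 16
print(cross_compatible(plain_encode, plain_decode, plain_encode, plain_decode, sample))
print(cross_compatible(plain_encode, plain_decode, framed_encode, framed_decode, sample))
```

For real filters, the same test would compare the bytes each plugin writes for an identical chunk, which requires access to both implementations.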

lucasvr · August 17, 2021, 6:45pm

That’s a good question, as I don’t see a link to download the source code of the current Snappy filter. I’ll contact the original author to ask him for a pointer.

epourmal · August 17, 2021, 11:20pm

epourmal · August 17, 2021, 11:21pm

This page has the links to different implementations.

lucasvr · August 18, 2021, 2:13am

Thanks Elena. Judging from the implementation of cuda-c’s uncompressor alone, the two implementations are equivalent. However, since we don’t have access to the source code of the existing Snappy I/O filter, I can’t tell if the two filters are compatible. I’d like to know if Snappy ID 32003 prepends other metadata to the beginning of each compressed block, for instance.

I sent an email to Michael Rissi asking him for directions to download the filter. If I don’t hear from him by the end of the week, I think it’s safer to use a new ID.

Best regards,
Lucas

epourmal · August 18, 2021, 2:35am

I know Mike. Please let me know if he doesn’t respond. I’ll try to reach him.

Thank you!
Elena

lucasvr · August 18, 2021, 12:40pm

Hi Elena,

I just got a response from Mike. It looks like the information on the portal is inaccurate, as the filter they developed was built on LZ4 as opposed to Snappy:

Oh that was quite some time ago… I never implemented the one using snappy, as LZ4 has a better compression and higher speed than snappy.
The LZ4 code, we handed over to the HDF5 group:
HDF5-External-Filter-Plugins/LZ4 at master · nexusformat/HDF5-External-Filter-Plugins · GitHub

Given that there’s no existing implementation, we can just reuse filter ID 32003. I will update the code accordingly.

epourmal · August 18, 2021, 5:44pm

Great! Please send me the updated links, etc., and I will update the filters table on the portal website.

Thank you!

lucasvr · August 19, 2021, 5:45pm

Hi Elena,

We can reuse most of the wording that describes Snappy already. Here’s a suggested description for the new filter. Please feel free to adjust as needed.

Snappy-CUDA Filter

Filter ID: 32003

Filter Description:

Snappy-CUDA is a compression/decompression library that leverages GPU processing power to compress/decompress data. The Snappy compression algorithm does not aim for maximum compression or compatibility with any other compression library; instead, it aims for very high speeds and reasonable compression. For instance, compared to the fastest mode of zlib, the reference implementation of Snappy on the CPU is an order of magnitude faster for most inputs, but the resulting compressed files are anywhere from 20% to 100% bigger.

Links:
https://github.com/lucasvr/snappy-cuda
https://github.com/google/snappy

Contact Information:

Lucas C. Villa Real
Email: lucasvr at gmail dot com

Thanks!
Lucas

epourmal · August 19, 2021, 8:49pm

Done. See https://confluence.hdfgroup.org/display/support/Filters

lucasvr · August 19, 2021, 8:57pm

Thank you, Elena! It looks good.

streetboys885 · August 31, 2021, 12:48pm

Elena, thank you very much! It appears to be in good condition.
