Mixing use of ZLib and SZip compression

When using HDF5, is it possible to mix ZLib and SZip compression? If yes, how?

My understanding is that:
1. ZLib can compress any data (char *, int, double, ...)
2. SZip is dedicated to number compression only (float, double)

According to HDF5 doc, we have:
1. to use H5Pset_deflate to use ZLib (deflate algorithm)
2. to use H5Pset_szip to use SZip

My understanding is that, if I use both H5Pset_deflate and H5Pset_szip, then:
1. data (whatever they are) which are NOT numbers will be compressed with ZLib
2. numbers will be compressed with SZip

Is this correct? Or did I get things wrong?

Thanks,

FH

When using HDF5, is it possible to mix ZLib and SZip compression? If yes, how?

Yes. You can have a single hdf5 file with some datasets that are compressed with zlib and others compressed with szip.

A single dataset compressed with both zlib and szip? I imagine that *might* be possible. Never tried it. Not sure why you'd want to do it. But I can't think of a reason HDF5 might balk at it, unless they've added logic to explicitly forbid it. For the reasons you mention below, I don't think szip *after* zlib would "work" at all. But zlib *after* szip might.

My understanding is that:
1. ZLib can compress any data (char *, int, double, ...)

Yes, it's a byte-level compressor. It doesn't care whether those bytes comprise an array of floats, doubles, ints, chars, etc.
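To make the "byte-level" point concrete, here is a minimal sketch using Python's standard zlib module rather than HDF5 itself. The buffer contents and the `roundtrip` helper are made up for illustration; the only claim is that deflate sees raw bytes and round-trips them losslessly whatever the element type.

```python
import struct
import zlib


def roundtrip(raw: bytes) -> bytes:
    """Compress and then decompress a byte buffer with zlib (deflate)."""
    return zlib.decompress(zlib.compress(raw, 6))


# Three buffers holding different element types; zlib only ever sees bytes.
text = b"HDF5 " * 200
ints = struct.pack("<200i", *range(200))
doubles = struct.pack("<200d", *(i * 0.5 for i in range(200)))

for name, raw in (("char", text), ("int", ints), ("double", doubles)):
    packed = zlib.compress(raw, 6)
    assert roundtrip(raw) == raw  # lossless regardless of element type
    print(f"{name}: {len(raw)} -> {len(packed)} bytes")
```

How well each buffer shrinks depends on its contents, not on the declared type.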

2. SZip is dedicated to number compression only (float, double)

I honestly can't recall but that sounds plausible/right.

According to HDF5 doc, we have:
1. to use H5Pset_deflate to use ZLib (deflate algorithm)
2. to use H5Pset_szip to use SZip

Yes. Though, take care to read the licensing limitations regarding szip and confirm your workflows involving it meet its requirements.

My understanding is that, if I use both H5Pset_deflate and H5Pset_szip,
then:
1. data (whatever they are) which are NOT numbers will be compressed
with ZLib
2. numbers will be compressed with SZip

I don't think it works that way. If you apply *both* filters to a dataset, HDF5 will apply each filter in order. Though, since zlib and szip are sort of built-in compressors, maybe the HDF5 library has some logic to handle them specially? If not, *and* if you want the behavior you describe here, it's easy to implement your own sort of merged zlib/szip filter that does something like…

  1. Check data type. If type is double or float, apply szip, else apply zlib.
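The dispatch step above could be sketched roughly as follows. This is only an illustration in Python, not an HDF5 filter: a real one would be written in C and registered with the library. The names `SZIP_TYPES`, `choose_filter`, `szip_stand_in` and `merged_filter` are hypothetical, and since the standard library has no SZip binding, the SZip branch is replaced by a clearly labelled stand-in.

```python
import zlib
from typing import Tuple

# Hypothetical labels for the element types that would be routed to SZip.
SZIP_TYPES = {"float", "double"}


def choose_filter(type_name: str) -> str:
    """Return which compressor the merged filter would pick for a type."""
    return "szip" if type_name in SZIP_TYPES else "zlib"


def szip_stand_in(raw: bytes) -> bytes:
    """Placeholder only: this is NOT SZip. It uses zlib so the sketch runs."""
    return zlib.compress(raw, 1)


def merged_filter(type_name: str, raw: bytes) -> Tuple[str, bytes]:
    """Compress raw with the compressor selected from the element type."""
    chosen = choose_filter(type_name)
    if chosen == "szip":
        return chosen, szip_stand_in(raw)
    return chosen, zlib.compress(raw, 6)


for type_name in ("char", "int", "float", "double"):
    print(type_name, "->", choose_filter(type_name))
```

The decision is made once per dataset from its element type, so each dataset still ends up with exactly one compressor.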

Hope that helps. I am 99% certain what I've just written is accurate :wink:

Mark

···

From: Hdf-forum <hdf-forum-bounces@lists.hdfgroup.org> on behalf of houssen <houssen@ipgp.fr>
Reply-To: HDF Users Discussion List <hdf-forum@lists.hdfgroup.org>
Date: Monday, January 11, 2016 2:00 AM
To: "hdf-forum@lists.hdfgroup.org" <hdf-forum@lists.hdfgroup.org>
Subject: [Hdf-forum] Mixing use of ZLib and SZip compression

Check the data type: if it is double or float, apply szip, else apply zlib. Sounds perfect to me!

Thanks,

Franck

Note: I don't know why, but I thought H5Pset_deflate / H5Pset_szip were supposed to be applied to the whole file (I got this wrong).

···

On 2016-01-11 16:45, Miller, Mark C. wrote:
