h5py and szip/aec filter on Windows


#1

Hi there,

I should mention up front that I am relatively new to Python and HDF5.
After hours of googling and experimenting, I haven’t been able to get h5py to encode with SZIP (or AEC) on Windows. With Visual Studio and C everything runs fine, but for a university project I have to use HDF5 from Python, not from C.
Does anyone here have experience with h5py on Windows? Is SZIP encoding with h5py possible at all?

Thank you for your help!

Jan


#2

Can you provide more information? What errors are you getting from h5py? Can you share a short example code of what you are trying to do? Are your HDF5 and SZIP libraries correctly installed and configured?

-Aleksandar


#3

Hi Aleksandar,

thanks for your reply. I just tried to get SZIP running in Python with a simple example for test purposes, in my case a numpy array:
dset = group.create_dataset('test', data=np.zeros(10000, dtype=np.float64), fletcher32=True, compression="szip")
When running this, I get an error that SZIP is not available, so I assume that something in my setup or configuration is not correct. Currently I’m using Anaconda, but I’ve read that h5py distributed by Anaconda is built with SZIP disabled and I should use pip instead. But even with the pip version I can’t get my code running.
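One way to narrow this down is to ask the HDF5 library that h5py was linked against whether the SZIP filter is registered, and whether it can encode as well as decode. The following is only a minimal sketch, not an official diagnostic: the helper name `szip_status` is made up for illustration, and it relies on the low-level `h5py.h5z` calls `filter_avail` and `get_filter_info`.

```python
def szip_status():
    """Report whether the installed h5py build can read/write SZIP data.

    `szip_status` is a hypothetical helper name, not part of h5py.
    """
    status = {
        "h5py_installed": False,
        "hdf5_version": None,
        "szip_encode": False,
        "szip_decode": False,
    }
    try:
        import h5py
        from h5py import h5z
    except ImportError:
        # Also covers "DLL load failed" errors on Windows.
        return status

    status["h5py_installed"] = True
    status["hdf5_version"] = h5py.version.hdf5_version

    # filter_avail() says whether the filter is registered at all;
    # get_filter_info() then tells encoder and decoder support apart.
    if h5z.filter_avail(h5z.FILTER_SZIP):
        info = h5z.get_filter_info(h5z.FILTER_SZIP)
        status["szip_encode"] = bool(info & h5z.FILTER_CONFIG_ENCODE_ENABLED)
        status["szip_decode"] = bool(info & h5z.FILTER_CONFIG_DECODE_ENABLED)
    return status


if __name__ == "__main__":
    print(szip_status())
```

If `szip_encode` comes back `False`, the error is coming from the HDF5 build behind h5py and not from the `create_dataset` call itself.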

So first of all, I’m not sure whether SZIP can be used with Python on Windows at all. I also tried to build h5py against one of my HDF5 versions installed with the binary installers provided on the website, but that completely destroyed my Anaconda setup (to be honest, I didn’t really know what I was doing).
Since AEC replaces SZIP as of HDF5 1.10.7, it would also be fine for me to use the AEC library. Is that already implemented in h5py? If so, how can I call the AEC filter with h5py? I can’t find any reliable information about that in the docs.

In my short C example I’m able to use SZIP, therefore I think that in principle my setup should be correct.

Thanks for your answers!

Jan


#4

Hi!

I think you are close; it’s just a matter of carefully aligning all the pieces of software.

You seem to have successfully installed both the HDF5 library with SZIP filter enabled and the SZIP (or AEC) library. Anaconda is by design a closed ecosystem so doing what you tried is not easy. Below is the information that may help you achieve that:

I’d suggest you create a brand new conda environment and follow the above instructions. You will need to define or modify environment variables so that your HDF5 and SZIP library install locations come before any others on your computer. Then do a custom build of h5py.
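As a rough sketch of what such a custom build could look like: the h5py build reads the `HDF5_DIR` environment variable, and pip's `--no-binary=h5py` option forces a source build instead of a prebuilt wheel. The helper name `h5py_source_build`, the example path, and the assumption that the HDF5 DLLs live in a `bin` subfolder of the install location are illustrative only and need to be adapted to the actual setup.

```python
import os
import sys


def h5py_source_build(hdf5_dir):
    """Return (command, environment) for building h5py against `hdf5_dir`.

    Hypothetical helper: it only prepares the call and does not run it.
    """
    env = dict(os.environ)
    # HDF5_DIR tells the h5py build where the HDF5 installation is.
    env["HDF5_DIR"] = hdf5_dir
    # Assumption: the HDF5/SZIP DLLs are in <hdf5_dir>/bin. Put that
    # folder first so it is found before any other HDF5 on the machine.
    bin_dir = os.path.join(hdf5_dir, "bin")
    env["PATH"] = bin_dir + os.pathsep + env.get("PATH", "")
    # --no-binary=h5py makes pip compile h5py instead of using a wheel.
    cmd = [sys.executable, "-m", "pip", "install", "--no-binary=h5py", "h5py"]
    return cmd, env


if __name__ == "__main__":
    # Example path only; replace with the real HDF5 install location.
    cmd, env = h5py_source_build(r"C:\path\to\HDF5")
    print(" ".join(cmd))
    print("HDF5_DIR =", env["HDF5_DIR"])
```

To actually run the build inside the fresh environment, the pair can be passed to `subprocess.run(cmd, env=env, check=True)`.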

-Aleksandar


#5

Hi @ajelenak,

thank you again for your support!
I’ve already tried to build h5py with the steps in the docs, but something failed and I got a lot of DLL errors. But I’ll try it again with your steps and share my experience here. In my last approach I think I’ve used the wrong environment variables. Furthermore, afterwards I had issues with numpy because it was built against a different version of the HDF5 library.

One more question about h5py: after building it against 1.10.7/1.12.0, will the ‘szip’ keyword automatically use the AEC filter instead of the SZIP filter? Or do I have to pass a different value for the compression argument in create_dataset?
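For context, as far as I know the only built-in compression names in h5py are 'gzip', 'lzf' and 'szip', and an HDF5 library built with libaec still registers the filter under the SZIP filter ID, so 'szip' should remain the keyword to try. The sketch below is an assumption-laden test, not a confirmed answer: it requests 'szip' and falls back to gzip when h5py reports the filter as unavailable. The function name `write_with_fallback` is hypothetical.

```python
import os
import tempfile


def write_with_fallback(path, data):
    """Write `data` with SZIP if this h5py build supports it, else gzip.

    Returns the compression name actually used, or None if h5py cannot
    be imported. Hypothetical helper, not part of h5py.
    """
    try:
        import h5py
    except ImportError:
        return None

    with h5py.File(path, "w") as f:
        try:
            dset = f.create_dataset("test", data=data,
                                    fletcher32=True, compression="szip")
        except ValueError:
            # h5py raises ValueError when the requested filter is
            # unavailable in the underlying HDF5 build.
            dset = f.create_dataset("test", data=data,
                                    fletcher32=True, compression="gzip")
        return dset.compression


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "szip_test.h5")
    print(write_with_fallback(target, [0.0] * 10000))
```

If this prints 'szip' after the rebuild, the filter (SZIP or its AEC replacement) is being picked up through the same keyword.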

All best
Jan


#6

I asked the conda-forge maintainer about szip support and was told it was omitted due to
the “non-commercial” license clause. Cygwin64 does provide libaec and szip compression
is available in NetCDF4 and CDO from Cygwin64.

With Windows 10 you can also get libaec and szip compression with Debian unstable in WSL2.