Hello everyone,
In our project (Fortran) we save our data in the HDF5 format and write it in parallel. In a current development step we want to make use of the compression filters. Generally we write out 3D scalar data and sometimes also 3D vector data (in the form of a 4D data set).
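One detail that may matter for the 4D case: as far as I understand, the HDF5 Fortran wrappers pass dimensions in reverse order to the underlying C library, so a dataset declared with Fortran dims (3, L, M, N) is stored as (N, M, L, 3) in file (C) order, with the component index varying fastest. A quick illustration (Python only for brevity; the dims are our actual ones):

```python
# Illustration (my understanding): the HDF5 Fortran wrappers reverse
# dimension order, so a dataset created with Fortran dims (3, L, M, N)
# appears as (N, M, L, 3) in file (C) order; the Fortran first
# dimension (the vector component here) is the fastest-changing one.
fortran_dims = (3, 32, 32, 32)            # as passed to H5SCREATE_SIMPLE_F
c_order_dims = tuple(reversed(fortran_dims))
print(c_order_dims)                       # (32, 32, 32, 3)
```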
I could make this work without problems for the gzip filter, thanks to the instructions found online (setting the filter, hyperslab and chunking).
But when I now try to do the same with the szip filter, I see the following behaviour:
- code runs/writes without any errors or warnings
- I can open the file
- the scalar fields contain the correct data
- the vector fields seem to be empty (if I open them with HDFCompass, ParaView or VisIt)
A typical error trace, e.g. from ParaView, looks like this:
HDF5-DIAG: Error detected in HDF5 (1.10.4) thread 139763220634752:
  #000: …/…/…/src/H5Dio.c line 199 in H5Dread(): can't read data
    major: Dataset
    minor: Read failed
  #001: …/…/…/src/H5Dio.c line 601 in H5D__read(): can't read data
    major: Dataset
    minor: Read failed
  #002: …/…/…/src/H5Dchunk.c line 2229 in H5D__chunk_read(): unable to read raw data chunk
    major: Low-level I/O
    minor: Read failed
  #003: …/…/…/src/H5Dchunk.c line 3609 in H5D__chunk_lock(): data pipeline read failed
    major: Dataset
    minor: Filter operation failed
  #004: …/…/…/src/H5Z.c line 1326 in H5Z_pipeline(): filter returned failure during read
    major: Data filters
    minor: Read failed
  #005: …/…/…/src/H5Zszip.c line 322 in H5Z_filter_szip(): szip_filter: decompression failed
    major: Resource unavailable
    minor: No space available for allocation
I tested my code by writing output on my laptop (HDF5 version 1.10.4) and on our cluster (HDF5 version 1.12.0), and with different pixels_per_block values, but in all cases the outcome is the same. So I am now thinking that either …
- … I am missing something in my code that is needed for szip
- … 4D data in the form of 3xLxMxN does not work with szip
- … this is a bug in HDF5
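Regarding the second point, my (possibly wrong) reading of the H5Pset_szip documentation is that pixels_per_block must be even and at most 32, and that the szip filter takes its scanline from the chunk's fastest-changing dimension, which in Fortran order is the first one. A quick sanity check with my chunk sizes (Python just for brevity; the scanline rule is my assumption, please correct me if it is wrong):

```python
# Hedged sanity check: IF szip's scanline really is the chunk's
# fastest-changing dimension (my assumption from the docs), then the
# 4D vector chunks give szip very short scanlines.
def szip_scanline(fortran_chunk_dims):
    # Fortran's first dimension varies fastest, so it is also the
    # fastest-changing dimension of the chunk in file (C) order.
    return fortran_chunk_dims[0]

pixels_per_block = 32
scalar_chunk = (32, 32, 16)     # local domain of one of 2 MPI processes
vector_chunk = (3, 32, 32, 16)  # same domain with the 3 components first

print(szip_scanline(scalar_chunk) >= pixels_per_block)  # True
print(szip_scanline(vector_chunk) >= pixels_per_block)  # False
```

If that reading is correct, the 4D chunks would hand szip a scanline of only 3 values, which would at least fit the observation that only the vector fields are affected.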
At the moment I am out of ideas what I could further test or change, so I am happy about any hints on how I might resolve this (or whether it is possible at all). Unfortunately I couldn't find any related topic when searching the forum.
Thank you,
Felix
The relevant code section (I deleted all the error checks to shorten it):
! this is just some example initialisation
! the offset shown is for the first of 2 MPI processes
datadims = (/ 3, 32, 32, 32 /)
datadims_loc = (/ 3, 32, 32, 16 /)
dataoffset_loc = (/ 0, 0, 0, 0 /)
data_rank = 4
! Open file etc. ...
! ...
! Create data space for data set
call H5SCREATE_SIMPLE_F(data_rank, datadims, filespace, error)
! Chunk size is chosen to be the local domain of the local process
chunkdims_loc = datadims_loc
call H5SCREATE_SIMPLE_F(data_rank, chunkdims_loc, memspace, error)
! Create chunked dataset.
call H5PCREATE_F(H5P_DATASET_CREATE_F, plist_id, error)
! Set szip (NN coding option mask, pixels_per_block = 32)
call H5PSET_SZIP_F(plist_id, H5_SZIP_NN_OM_F, 32, error)
! Set chunk
call H5PSET_CHUNK_F(plist_id, data_rank, chunkdims_loc, error)
! Create the dataset
call H5DCREATE_F(file_id, fieldname, memtype_id, filespace, dset_id, error, plist_id)
call H5SCLOSE_F(filespace, error)
! Select hyperslab in the file.
call H5DGET_SPACE_F(dset_id, filespace, error)
! Chunk parameters follow the decision above: local domain = one chunk
allocate(chunk_count(data_rank), chunk_stride(data_rank))
do i = 1, data_rank
   chunk_count(i)  = 1
   chunk_stride(i) = 1
end do
call H5SSELECT_HYPERSLAB_F(filespace, H5S_SELECT_SET_F, &
& dataoffset_loc, chunk_count, &
& error, chunk_stride, chunkdims_loc)
deallocate(chunk_count, chunk_stride)
! Create a separate property list for the collective dataset write
! (distinct from the dataset-creation plist_id, so that both handles
!  can be closed cleanly afterwards)
call H5PCREATE_F(H5P_DATASET_XFER_F, xfer_plist_id, error)
call H5PSET_DXPL_MPIO_F(xfer_plist_id, H5FD_MPIO_COLLECTIVE_F, error)
! Write the data set
call H5DWRITE_F(dset_id, memtype_id, &
                vectordata_to_write, &
                datadims_loc, error, file_space_id = filespace, &
                mem_space_id = memspace, &
                xfer_prp = xfer_plist_id)
! after this all CLOSE subroutines are called
! ...