pHDF5: problem with szip compression and vector/4D data


#1

Hello everyone,

In our project (Fortran) we save our data in the HDF5 format and write it in parallel. In a current development step we want to make use of the compression filters. Generally we write 3D scalar data and sometimes also 3D vector data (in the form of a 4D data set).

I could make this work without problems for the gzip filter thanks to instructions found online (setting the filter, hyperslab and chunking).
But when I now try to do the same for the szip filter, I see the following behaviour:

  • code runs/writes without any errors or warnings
  • I can open the file
  • the scalar fields contain the correct data
  • the vector fields seem to be empty (if I open them with HDFCompass, ParaView or VisIt)

A typical warning/error in e.g. ParaView would be:

HDF5-DIAG: Error detected in HDF5 (1.10.4) thread 139763220634752:
#000: …/…/…/src/H5Dio.c line 199 in H5Dread(): can’t read data
major: Dataset
minor: Read failed
#001: …/…/…/src/H5Dio.c line 601 in H5D__read(): can’t read data
major: Dataset
minor: Read failed
#002: …/…/…/src/H5Dchunk.c line 2229 in H5D__chunk_read(): unable to read raw data chunk
major: Low-level I/O
minor: Read failed
#003: …/…/…/src/H5Dchunk.c line 3609 in H5D__chunk_lock(): data pipeline read failed
major: Dataset
minor: Filter operation failed
#004: …/…/…/src/H5Z.c line 1326 in H5Z_pipeline(): filter returned failure during read
major: Data filters
minor: Read failed
#005: …/…/…/src/H5Zszip.c line 322 in H5Z_filter_szip(): szip_filter: decompression failed
major: Resource unavailable
minor: No space available for allocation

I tested my code by writing output on my laptop (HDF5 version 1.10.4) and on our cluster (HDF5 version 1.12.0), with different pixels_per_block values, but in all cases the outcome is the same. So I am now thinking that either …

  • … I am missing something in my code that is needed for szip
  • … 4D data in the form of 3xLxMxN does not work with szip
  • … this is a bug in HDF5

At the moment I am out of ideas about what I could further test or change, so I would be happy for any hints on how I might be able to resolve this (or whether it is even possible). Unfortunately I couldn’t find any related topic when searching the forum.

Thank you,
Felix


The relevant code section (I deleted all the error checks to shorten it):

! this is just some example initialisation (!)
! here offset is due to first proc of 2 MPI processes
datadims       = (/ 3, 32, 32, 32 /)
datadims_loc   = (/ 3, 32, 32, 16 /)
dataoffset_loc = (/ 0, 0, 0, 0 /)
data_rank      = 4

! Open file etc. ...
! ...

! Create data space for data set
call H5SCREATE_SIMPLE_F(data_rank, datadims, filespace, error)

! Chunk size is chosen to be the local domain of the local process
chunkdims_loc = datadims_loc
call H5SCREATE_SIMPLE_F(data_rank, chunkdims_loc, memspace, error)

! Create chunked dataset.
call H5PCREATE_F(H5P_DATASET_CREATE_F, plist_id, error)

! Set szip:
call H5PSET_SZIP_F(plist_id, H5_SZIP_NN_OM_F, 32, error)

! Set chunk
call H5PSET_CHUNK_F(plist_id, data_rank, chunkdims_loc, error)

! Create the dataset
call H5DCREATE_F(file_id, fieldname, memtype_id, filespace, dset_id, error, plist_id)

call H5SCLOSE_F(filespace, error)

! Select hyperslab in the file.
call H5DGET_SPACE_F(dset_id, filespace, error)

! Chunk parameters are chosen according to the previously made decision "local domain = one chunk"
allocate(chunk_count(data_rank), chunk_stride(data_rank))
do i = 1, data_rank
  chunk_count(i)  = 1
  chunk_stride(i) = 1
end do
call H5SSELECT_HYPERSLAB_F(filespace, H5S_SELECT_SET_F, &
                                      & dataoffset_loc, chunk_count, &
                                      & error, chunk_stride, chunkdims_loc)
deallocate(chunk_count, chunk_stride)

! Create property list for collective dataset write
call H5PCREATE_F(H5P_DATASET_XFER_F, plist_id, error)
call H5PSET_DXPL_MPIO_F(plist_id, H5FD_MPIO_COLLECTIVE_F, error)

! Write data set
call H5DWRITE_F(dset_id, memtype_id, &
                               vectordata_to_write, &
                               datadims_loc, error, file_space_id = filespace, &
                                                    mem_space_id  = memspace,  &
                                                    xfer_prp      = plist_id)

! after this all CLOSE subroutines are called
! ...

#2

Does h5dump work on the file? What is memtype_id? You seem to have all the ranks writing to the same place in the file. Are you setting the fill time to never?

This example works, so you can check your setup. You can compile it with h5pfc.

hyperslab_by_chunk.f90 (4.7 KB)


#3

Also, are you sure that the viz. tools have an HDF5 version with szip support? szip is not enabled by default.


#4

Thank you @brtnfld for your answers.
I checked the following so far:

Also, are you sure that the viz. tools have an HDF5 version with szip support? szip is not enabled by default.

Yes: I can display szip-compressed scalar data, so I think vector data shouldn’t be a problem either.
I also checked this with the .h5 file generated by your code: I opened it via a .xmf file in VisIt and the vectors were drawn just fine.

Does h5dump work on the file?

Yes, and for the scalar data all information is shown correctly, but for the vector data it says

DATASET "u" {
      DATATYPE  H5T_IEEE_F64LE
      DATASPACE  SIMPLE { ( 32, 32, 32, 3 ) / ( 32, 32, 32, 3 ) }
      DATA {h5dump error: unable to print data

      }
   }

What is memtype_id?

This variable holds the information on the type of the data, e.g. integer or real and its precision. In this case I do the following:

memtype_id = h5kind_to_type(RK, H5_REAL_KIND) ! RK = REAL64
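In context, the declaration and initialisation look roughly like this (a sketch with error checks omitted):

```fortran
use hdf5
use iso_fortran_env, only: REAL64

integer, parameter :: RK = REAL64   ! our working real kind
integer(hid_t)     :: memtype_id    ! datatype identifiers are INTEGER(HID_T)

! map the Fortran real kind to the matching HDF5 datatype identifier
memtype_id = h5kind_to_type(RK, H5_REAL_KIND)
```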

You seem to be having all the ranks writing to the same place in the file.

Do you mean because in my example there is only one offset given? Of course in the real code this differs (every MPI process has its own offset). In the example I gave in my initial post I just wanted to put some information without making it too complicated.

Are you setting the fill time to never?

I do not understand what you mean by this. Which time do you mean?


I tried your example code; it compiles perfectly for me and produces a vector field which I can display with h5dump without any error. I also modified your code and tried e.g. H5T_NATIVE_DOUBLE, which worked as well.
Strangely, I had problems when I defined a variable memtype_id (as I do in my code), assigned the type to that variable and then used the variable as input to the subroutines. The errors clearly say that the type is not recognised:

HDF5-DIAG: Error detected in HDF5 (1.10.4) MPI-process 2:
  #000: ../../../src/H5D.c line 121 in H5Dcreate2(): not a datatype ID
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.10.4) MPI-process 2:
  #000: ../../../src/H5D.c line 372 in H5Dget_space(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.10.4) MPI-process 2:
  #000: ../../../src/H5S.c line 921 in H5Sget_simple_extent_ndims(): not a dataspace
    major: Invalid arguments to routine
    minor: Inappropriate type

So I changed my code and used H5T_NATIVE_DOUBLE directly as an argument to the subroutines, but this hasn’t solved my problem so far. Apart from that I can’t see any difference between your code and my code at the moment. But I will have a second look tomorrow; sometimes a little distance surely helps.


#5

Did you declare memtype_id INTEGER(HID_T)?

H5Pset_fill_time is the API I was referencing; you will usually want to set it to H5D_FILL_TIME_NEVER for better performance if you know that you will be writing the whole dataset.
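In Fortran it would be set on the dataset creation property list, roughly like this (a sketch; plist_id as in your code):

```fortran
! tell HDF5 never to write fill values; the whole dataset is overwritten anyway
call h5pset_fill_time_f(plist_id, H5D_FILL_TIME_NEVER_F, error)
```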


#6

Did you declare memtype_id INTEGER(HID_T)?

Sorry, I forgot to mention this, but yes, it is declared as an HID_T integer.

Thanks as well for the performance tip; I have changed that in our code now. But so far this didn’t help with the general vector data output.

But I think I narrowed down the problem:

  • I compared my code to yours again, subroutine by subroutine, and in that sense it is identical: I call the same subroutines with the same logic.
  • Just for test purposes I changed my output from my “scientific” data set (double precision, values ranging from ~1.0e-16 to 5) to a very simple data set based on integers. And then it worked!
  • Then I converted my double precision data to single precision, and it worked as well.
  • But if I again use my double precision data (or convert it to C_DOUBLE), it stops working again.

In all of this my memtype_id stayed “REAL64”. So I am a little confused now about what the reason could be. I don’t think I should need to convert to single precision to make it work. But maybe my Fortran-defined precision somehow doesn’t always work with the HDF5 precisions?


#7

I’ve not been able to reproduce an error with the modified program. I’m not aware of any Fortran precision that is not allowable in HDF5. Maybe there is a NaN or Inf in your original data?

Can you run h5dump with the options --enable-error-stack and -pH on your output file?

hyperslab_by_chunk.f90 (4.9 KB)


#8

Maybe there is a NaN or Inf in your original data?

I checked that with the IEEE_IS_NAN and IEEE_IS_FINITE functions, but all data entries are fine.
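The check I ran looks roughly like this (a sketch; field stands in for my 4D data buffer):

```fortran
use, intrinsic :: ieee_arithmetic, only: ieee_is_nan, ieee_is_finite

! both functions are elemental, so they apply to the whole array at once
if (any(ieee_is_nan(field)) .or. any(.not. ieee_is_finite(field))) then
   write(*,*) 'data contains NaN or Inf entries'
end if
```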

Can you run h5dump with the options --enable-error-stack and -pH on your output file.

Yes sure, the output for my vector data is:

   DATASET "u" {
      DATATYPE  H5T_IEEE_F64LE
      DATASPACE  SIMPLE { ( 32, 32, 32, 3 ) / ( 32, 32, 32, 3 ) }
      STORAGE_LAYOUT {
         CHUNKED ( 16, 32, 32, 3 )
         SIZE 786432 (1.000:1 COMPRESSION)
      }
      FILTERS {
         COMPRESSION SZIP {
            PIXELS_PER_BLOCK 32
            MODE K13
            CODING NEAREST NEIGHBOUR
            BYTE_ORDER LSB
            HEADER RAW
         }
      }
      FILLVALUE {
         FILL_TIME H5D_FILL_TIME_NEVER
         VALUE  H5D_FILL_VALUE_DEFAULT
      }
      ALLOCATION_TIME {
         H5D_ALLOC_TIME_EARLY
      }
   }

And for comparison, the working scalar data looks like this:

   DATASET "rho" {
      DATATYPE  H5T_IEEE_F64LE
      DATASPACE  SIMPLE { ( 32, 32, 32 ) / ( 32, 32, 32 ) }
      STORAGE_LAYOUT {
         CHUNKED ( 16, 32, 32 )
         SIZE 13320 (19.680:1 COMPRESSION)
      }
      FILTERS {
         COMPRESSION SZIP {
            PIXELS_PER_BLOCK 32
            MODE K13
            CODING NEAREST NEIGHBOUR
            BYTE_ORDER LSB
            HEADER RAW
         }
      }
      FILLVALUE {
         FILL_TIME H5D_FILL_TIME_NEVER
         VALUE  H5D_FILL_VALUE_DEFAULT
      }
      ALLOCATION_TIME {
         H5D_ALLOC_TIME_EARLY
      }
   }

The only difference I see (besides the additional dimension) is that the vector data is not actually compressed (1.000:1 compression ratio). Does this give a hint?

I also tried your modified code with double precision data. It works perfectly for me, which puzzles me even more.
Based on your code I tried to change the definition of double precision in my code, but it didn’t help.


#10

After a longer time I tried to work on this topic again, and what I see now is the following:

When I use my code, which I thought did not work, and fill each position in the vector field with the same components (as is more or less done in the example code of @brtnfld), the output is readable without any problems.
So I assume that the problem is more connected with the fact that my data can be different in basically every point and every vector component, and that this somehow breaks szip for vector data?

Perhaps someone has an idea about this?


#11

I think I have narrowed down the problem a little bit more:
If I modify the code of @brtnfld a little (see attached file), specifically altering the filling process to the following,

 small = TINY(1.0_dp)*1.d287

 do k = 1, chunk_dims(4)
   do j = 1, chunk_dims(3)
     do i = 1, chunk_dims(2)
       do l = 1, chunk_dims(1)
         small = small * -1.0d0
         DATA(l,i,j,k) = REAL(mpi_rank+(l*i*j*k),dp) * small
         write(*,*) DATA(l,i,j,k)
       end do
     end do
   end do
 end do

which fills in varying values of about 1.0d-19, I get the same error messages as with my code (see initial post). Hence I assume that the szip compression might have a problem with values close to the precision limit?
If I change the precision, e.g. to single precision as already mentioned, or write values in the range of 1.0d-16, the code starts to work again.

Does someone have an idea about this?
At the moment I only see the unsatisfactory alternative of converting the numbers to a lower precision to get szip to work properly.
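The conversion I have in mind would look roughly like this (a sketch; field stands in for my actual data array):

```fortran
use iso_fortran_env, only: REAL32, REAL64

real(REAL64), allocatable :: field(:,:,:,:)     ! original double precision data
real(REAL32), allocatable :: field_sp(:,:,:,:)  ! down-converted copy

! deliberately lose precision before writing; the dataset would then be
! created with a single precision memtype instead of REAL64
field_sp = real(field, REAL32)
```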

hyperslab_test.f90 (5.2 KB)


#12

Which szip library are you using? Have you tried the open-source one, https://gitlab.dkrz.de/k202009/libaec?
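To see which library is actually in use, one could inspect the HDF5 build configuration and the linked libraries, e.g. (a sketch; the exact tool names and paths depend on the installation):

```shell
# the configure summary lists whether szip encoding/decoding is enabled
h5pcc -showconfig | grep -i szip

# check which szip/aec shared library the tools are linked against
ldd "$(which h5dump)" | grep -i -e libsz -e libaec
```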


#13

I am not 100% sure, but since I use HDF5 library version 1.10.4, I assume that szip version 2.1.1 is used. Is there a command to check this? Basically I use the HDF5 library provided for Ubuntu 20.04.3 LTS. I haven’t used a different szip source so far, as I didn’t compile the library myself (and I would actually like to stick with a standard installation).

Just because I am curious: if you use the code I attached to my previous post, do you get the same errors when you try to open the output file?