Hi, I am developing the independent library PureHDF with write support for HDF5 files. I use h5dump to verify that the written files conform to the spec. I created a file containing a dataset with a null dataspace, and h5dump 1.14.1 was able to dump it:
HDF5 "<file-path>" {
GROUP "/" {
   DATASET "Null" {
      DATATYPE H5T_STD_I32LE
      DATASPACE NULL
      DATA {
      }
   }
}
}
Do you happen to have that file available somewhere? We do test NULL dataspaces, but generally the library shouldn’t have allocated space in the file for a 0-sized contiguous dataset, so the check should normally be skipped. For example, here’s one of the test files we have that has a dataset with a NULL dataspace:
Notice the OFFSET HADDR_UNDEF part, which means that file space isn’t allocated and the check is skipped. What does the output of h5dump -pH look like for your file? It’s possible there’s a bug in either older versions or the current version of HDF5 with regard to file space allocation for 0-sized datasets, but I also think the use of H5_addr_le may have been unintentional and H5_addr_lt may have been intended instead.
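To make the H5_addr_le versus H5_addr_lt point concrete, here is a minimal sketch of an address-overflow check of that general shape. It is not the library's code: `addr_t`, `ADDR_UNDEF`, `addr_le`, `addr_lt`, `rejects_with_le` and `rejects_with_lt` are hypothetical stand-ins, and the assumption that the check compares `addr + size` against `addr` is mine. It only illustrates why a "less than or equal" comparison would reject a dataset that has an allocated address but a size of 0 bytes, while a strict "less than" would not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the HDF5 library's definitions. */
typedef uint64_t addr_t;
#define ADDR_UNDEF UINT64_MAX /* plays the role of HADDR_UNDEF */

/* Comparisons that only succeed when both addresses are defined. */
static bool addr_le(addr_t a, addr_t b)
{
    return a != ADDR_UNDEF && b != ADDR_UNDEF && a <= b;
}

static bool addr_lt(addr_t a, addr_t b)
{
    return a != ADDR_UNDEF && b != ADDR_UNDEF && a < b;
}

/* Returns true if the region [addr, addr + size) is rejected.
 * Variant using "<=": a 0-byte region at a defined address is rejected,
 * because addr + 0 <= addr always holds. */
static bool rejects_with_le(addr_t addr, uint64_t size)
{
    if (addr == ADDR_UNDEF)
        return false; /* no file space allocated: check is skipped */
    return addr_le(addr + size, addr); /* unsigned wrap-around is well defined */
}

/* Variant using "<": only a genuine wrap-around is rejected. */
static bool rejects_with_lt(addr_t addr, uint64_t size)
{
    if (addr == ADDR_UNDEF)
        return false; /* no file space allocated: check is skipped */
    return addr_lt(addr + size, addr);
}
```

Under these assumptions, `rejects_with_le(2048, 0)` is true while `rejects_with_lt(2048, 0)` is false, and both variants skip the check when the address is undefined.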
If space allocation is generally not allowed in combination with a null dataspace, it would be great to add this information to the spec document (HDF5: HDF5 File Format Specification Version 3.0) so others will not run into the same issue.
To avoid confusion: I am not creating files with the C library but with my own independent implementation. I was simply following the HDF5 spec and ran into this issue. To spare other library authors the same problem and to align the spec with reality, I would appreciate it if the condition "null dataspace means no file space allocation is allowed" became part of the spec.
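For other authors of independent writers, a possible interim approach is to mirror what the C library does and store the undefined address for a null dataspace. The sketch below is only an illustration under assumptions: `encode_contiguous_address` and its parameters are hypothetical names, not part of PureHDF or the HDF5 library. It relies on two things I take from the file format specification: the undefined address has all bits set, and multi-byte values are stored little-endian.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fills the address field of a contiguous layout.
 * out             buffer of at least size_of_offsets bytes
 * size_of_offsets width of addresses in this file (commonly 8)
 * is_null_space   nonzero if the dataset has a null dataspace
 * allocated_addr  address to store when file space was allocated */
static void encode_contiguous_address(uint8_t *out, size_t size_of_offsets,
                                      int is_null_space, uint64_t allocated_addr)
{
    if (is_null_space) {
        /* Undefined address: every bit set, so no file space is claimed. */
        memset(out, 0xFF, size_of_offsets);
        return;
    }

    /* Little-endian encoding of the allocated address. */
    for (size_t i = 0; i < size_of_offsets; i++)
        out[i] = (i < 8) ? (uint8_t)(allocated_addr >> (8 * i)) : 0;
}
```

Whether writers should be required to do this is exactly the open question in this thread, so treat it as a workaround rather than as specified behavior.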
It’s an interesting question for sure. Considering the behavior of previous versions of the library, I’m tempted to say that there shouldn’t be any reason for this to not be allowed and that the library should be fixed. When you previously allocated file space for the dataset, was there a particular fixed address given to the dataset (since you would have been allocating a 0-byte region)?
Thanks for the pointer! It may have been that this was intentionally left unspecified in the file format specification, but we plan on discussing this internally to determine a resolution on it. In the meantime, I’ll fix the newly-added check in the library because I believe it’s an ill-formed check in either case.