In both cases, the library attempts to read from addresses beyond the end-of-allocation (EOA) address, which is rather small (2048) and doesn’t make sense for the file size you’ve quoted. Assuming that the file wasn’t closed properly, it’s likely that certain elements of the superblock weren’t updated. You can obtain a dump of the first 128 bytes by running this:
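Any hex-dump tool will do; for example, with GNU od (the filename below is just a placeholder for your file):

od -A x -t x1z -N 128 your_file.h5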
The file size is 6,208 bytes, or 0x1840. Looking at the file format specification, you can spot the End of File Address immediately following the Address of File Free Space Info (which is ffff ffff ffff ffff in this example).
We were able to confirm that the writing application was indeed terminated with a SIGTERM signal.
I would like to know if you have any guidance on what a reasonable approach would be when the writing application is terminated in this manner.
Should the HDF5 file be removed, and some sort of warning logged, so the end user is aware of what happened?
I looked through the HDF5 documentation, and it seems I could call H5Fflush(H5File::getId(), H5F_SCOPE_GLOBAL) to flush the in-memory buffers to disk. Is this recommended?
Both are sensible steps to take. How effective they are depends a lot on the specifics of the disruption. If it’s not I/O-related and the HDF5 library’s structures (in user space!) weren’t compromised, there’s a good chance that flushing (and closing!) will leave things in a sane state. If it is I/O-related, e.g., a full disk, a failed device, or a (temporarily) lost connection, the chances of exiting gracefully might be slim. The assumption should be that the HDF5 library has no logic for “taking evasive action.” If a call fails, it fails, and the error stack will have a record of that, but any retry logic or assessment of state sanity is on the application.
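Since the signal in question is SIGTERM (which can be caught), one common pattern is to have the handler only set a flag and to do the flush/close from the main write loop, because HDF5 calls aren’t async-signal-safe. A minimal sketch, with the file name and the loop body as placeholders:

#include <csignal>
#include "H5Cpp.h"

static volatile std::sig_atomic_t g_stop = 0;

extern "C" void on_sigterm(int) { g_stop = 1; }        // only sets a flag

int main() {
    std::signal(SIGTERM, on_sigterm);
    H5::H5File file("out.h5", H5F_ACC_TRUNC);          // placeholder file name
    while (!g_stop /* && more work to do */) {
        // ... create groups/datasets, write data ...
    }
    // Reached on normal completion or after SIGTERM was received:
    H5Fflush(file.getId(), H5F_SCOPE_GLOBAL);          // flush in-memory buffers
    file.close();                                       // close the file cleanly
    return 0;
}

Whether the flush actually makes it to disk still depends on the failure mode described above; this only covers the case where the process gets a chance to shut down in an orderly way.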
Also, we were able to reproduce the abnormal termination of the writing application (which resulted in the corrupt HDF5 file). It is due to an assertion failure in the HDF5 library involving the H5C_MAX_ENTRY_SIZE constant.
The datasets use a compound datatype – POD structs with numeric (size_t, int, float, double), string, and boolean members. All our datasets have RANK = 1. No chunking (yet).
In this case, the writing application creates ~4M groups; each group contains 2 sub-groups, and each of those in turn contains about 20 sub-groups. The above-mentioned datasets reside at this level.
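For reference, a rough sketch of how such a compound type and a rank-1 dataset are set up with the C++ API (the struct layout and field names below are illustrative, not our actual ones):

#include "H5Cpp.h"

struct Record {                 // illustrative POD layout
    size_t  id;                 // size_t member, mapped to NATIVE_HSIZE below
    int     count;
    double  value;
    char    name[32];           // fixed-length string member
    hbool_t flag;               // boolean member
};

H5::CompType make_record_type() {
    H5::CompType t(sizeof(Record));
    t.insertMember("id",    HOFFSET(Record, id),    H5::PredType::NATIVE_HSIZE);
    t.insertMember("count", HOFFSET(Record, count), H5::PredType::NATIVE_INT);
    t.insertMember("value", HOFFSET(Record, value), H5::PredType::NATIVE_DOUBLE);
    t.insertMember("name",  HOFFSET(Record, name),  H5::StrType(H5::PredType::C_S1, 32));
    t.insertMember("flag",  HOFFSET(Record, flag),  H5::PredType::NATIVE_HBOOL);
    return t;
}

// Usage: rank-1, contiguous (no chunking), created deep inside the group tree, e.g.
//   hsize_t dims[1] = {n_records};
//   H5::DataSpace space(1, dims);
//   group.createDataSet("records", make_record_type(), space);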
Can you reproduce the error at will? Can you provide us with a reproducer? Nothing you are describing sounds unusual. The comment on the H5C_MAX_ENTRY_SIZE definition sounds a little equivocal:
/* This sanity-checking constant was picked out of the air. Increase
* or decrease it if appropriate. Its purpose is to detect corrupt
* object sizes, so it probably doesn't matter if it is a bit big.
*/
#define H5C_MAX_ENTRY_SIZE ((size_t)(32 * 1024 * 1024))
It suggests that cache entries aren’t expected to be big (32 MiB is the cap), and nothing in your description suggests anything near that size. My hunch is that this has nothing to do with the H5File::createDataSet call itself but that some corruption (“detect corrupt object sizes”) is occurring in your application or somewhere in the library.
For the given case, I’m able to reproduce the assertion in the HDF5 library consistently. I’m not sure I’ll have the bandwidth to create a standalone reproducer, but I will try to do so in the next week or so.
I know that the writer application does not have Valgrind issues like Invalid Read/Write errors. Out of curiosity, I re-ran Valgrind and noticed this:
===
==400120== Syscall param pwrite64(buf) points to uninitialised byte(s)
==400120== at 0x12799FC3: ??? (in /usr/lib64/libpthread-2.17.so)
==400120== by 0x432FAB7: H5FD_sec2_write (H5FDsec2.c:816)
==400120== by 0x43273C8: H5FD_write (H5FDint.c:248)
==400120== by 0x460D996: H5F__accum_write (H5Faccum.c:826)
==400120== by 0x4465781: H5PB_write (H5PB.c:1031)
==400120== by 0x4304040: H5F_block_write (H5Fio.c:251)
==400120== by 0x426A9BA: H5C__flush_single_entry (H5C.c:6109)
==400120== by 0x4272611: H5C__make_space_in_cache (H5C.c:6961)
==400120== by 0x42735A7: H5C_insert_entry (H5C.c:1458)
==400120== by 0x423B279: H5AC_insert_entry (H5AC.c:810)
==400120== by 0x43ED434: H5O__apply_ohdr (H5Oint.c:548)
==400120== by 0x43F40DA: H5O_create (H5Oint.c:316)
==400120== by 0x42A6D53: H5D__update_oh_info (H5Dint.c:1030)
==400120== by 0x42A9C64: H5D__create (H5Dint.c:1373)
==400120== by 0x46071A5: H5O__dset_create (H5Doh.c:300)
==400120== by 0x43F1FB9: H5O_obj_create (H5Oint.c:2521)
==400120== by 0x43AB717: H5L__link_cb (H5L.c:1850)
==400120== by 0x43651E9: H5G__traverse_real (H5Gtraverse.c:629)
==400120== by 0x4365F80: H5G_traverse (H5Gtraverse.c:854)
==400120== by 0x43A37ED: H5L__create_real (H5L.c:2044)
==400120== by 0x43AD96E: H5L_link_object (H5L.c:1803)
==400120== by 0x42A8E28: H5D__create_named (H5Dint.c:410)
==400120== by 0x45A9051: H5VL__native_dataset_create (H5VLnative_dataset.c:74)
==400120== by 0x458409F: H5VL__dataset_create (H5VLcallback.c:1834)
==400120== by 0x458E19C: H5VL_dataset_create (H5VLcallback.c:1868)
==400120== by 0x42991AC: H5Dcreate2 (H5D.c:150)
==400120== by 0x41FBB9C: H5::H5Location::createDataSet(char const*, H5::DataType const&, H5::DataSpace const&, H5::DSetCreatPropList const&, H5::DSetAccPropList const&, H5::LinkCreatPropList const&) const (H5Location.cpp:932)
==400120== by 0x41FBD78: H5::H5Location::createDataSet(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, H5::DataType const&, H5::DataSpace const&, H5::DSetCreatPropList const&, H5::DSetAccPropList const&, H5::LinkCreatPropList const&) const (H5Location.cpp:958)
…
// Rest of writer application stack
===
Is this something that should be addressed? If so, could you suggest how? Valgrind reports only one occurrence of this issue.
(The experts will correct me…) I think this is nothing to lose sleep over. When a new object (e.g., a dataset) is created, it’s linked into the group structure, and an object header is created. Furthermore, the metadata cache is updated so that things are on hand when needed. If you dig into the code, various structures with array fields may get only partially initialized (the arrays), and I think that’s what Valgrind is calling out here.
(OK, we don’t have the state of the metadata cache in your application…)
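If you just want to keep the Memcheck output clean in the meantime, a suppression entry along these lines should hide it (the name is made up, the frames are taken from the stack above, and this is not an official HDF5-provided suppression):

{
   hdf5_object_header_uninit_pwrite
   Memcheck:Param
   pwrite64(buf)
   ...
   fun:H5FD_sec2_write
}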
We could try to reproduce the error by creating just the dataset you’re dealing with. What are the type and shape of that dataset, and what are the creation properties?
Rebuilt the HDF5 library with a larger H5C_MAX_ENTRY_SIZE definition and re-ran the writer application. This time, a valid HDF5 file was generated that could be read back. However, generating the ~347 GB file took about 22 hours, which seems quite excessive.
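(The change was simply to raise the constant, i.e., something of this form; the value shown here is only illustrative, not the exact one we used:)

#define H5C_MAX_ENTRY_SIZE ((size_t)(128 * 1024 * 1024))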
Is this change to H5C_MAX_ENTRY_SIZE something that you would recommend incorporating, at least temporarily? A customer of ours is evaluating our flow, and we would like them to be able to move forward with the HDF5 write/read.
As I mentioned in the previous post, I have a stripped-down version of the writer code that I can share. Please let me know where to send it.
Can you try a newer library version? HDF5 1.12.0 is relatively old and will reach end of life soon. We have fixed several CVE issues that involved the writing of uninitialized data. I’d recommend 1.14.1 (release imminent) or 1.10.10. If you have to stick with 1.12, use 1.12.2.
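If it helps to double-check which library your application is actually linked against at run time, a small snippet like this (using the C API’s H5get_libversion, which is also callable from your C++ code) prints both the linked library version and the header version you compiled against:

#include <cstdio>
#include "hdf5.h"

int main() {
    unsigned maj = 0, min = 0, rel = 0;
    H5get_libversion(&maj, &min, &rel);   // queries the library actually linked in
    std::printf("Linked HDF5 library:      %u.%u.%u\n", maj, min, rel);
    std::printf("Compiled against headers: %d.%d.%d\n",
                H5_VERS_MAJOR, H5_VERS_MINOR, H5_VERS_RELEASE);
    return 0;
}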