HDF5 read error when reading huge file

In both cases, the library attempts to read from addresses beyond the end-of-allocation (EOA) address, which is rather small (2,048 bytes) and doesn't make sense for the file size you've quoted. Assuming the file wasn't closed properly, it's likely that certain elements of the superblock weren't updated. You can obtain a dump of the first 128 bytes by running:

od -x -N 128 your_file_name

What does that look like?

G.

This is the ‘od’ command output:

0000000 4889 4644 0a0d 0a1a 0000 0000 0800 0008
0000020 0004 0010 0001 0000 0000 0000 0000 0000
0000040 ffff ffff ffff ffff 0800 0000 0000 0000
0000060 ffff ffff ffff ffff 0000 0000 0000 0000
0000100 0060 0000 0000 0000 0001 0000 0000 0000
0000120 0088 0000 0000 0000 02a8 0000 0000 0000
0000140 0001 0004 0001 0000 0018 0000 0000 0000
0000160 0010 0010 0000 0000 0320 0000 0000 0000
0000200

-Kat

For reference, here’s the same dump for h5ex_t_vlstring.h5 from our examples collection:

0000000 4889 4644 0a0d 0a1a 0000 0000 0800 0008
0000020 0004 0010 0000 0000 0000 0000 0000 0000
0000040 ffff ffff ffff ffff 1840 0000 0000 0000
0000060 ffff ffff ffff ffff 0000 0000 0000 0000
0000100 0060 0000 0000 0000 0001 0000 0000 0000
0000120 0088 0000 0000 0000 02a8 0000 0000 0000
0000140 0001 0001 0001 0000 0018 0000 0000 0000
0000160 0011 0010 0000 0000 0088 0000 0000 0000
0000200

The file size is 6,208 bytes, or 0x1840. Looking at the file format specification, you can spot the End of File Address immediately following the Address of File Free Space Info, which is ffff ffff ffff ffff (i.e., undefined) in this example.

In your example,

0000000 4889 4644 0a0d 0a1a 0000 0000 0800 0008
0000020 0004 0010 0001 0000 0000 0000 0000 0000
0000040 ffff ffff ffff ffff 0800 0000 0000 0000
0000060 ffff ffff ffff ffff 0000 0000 0000 0000
0000100 0060 0000 0000 0000 0001 0000 0000 0000
0000120 0088 0000 0000 0000 02a8 0000 0000 0000
0000140 0001 0004 0001 0000 0018 0000 0000 0000
0000160 0010 0010 0000 0000 0320 0000 0000 0000
0000200

The End of File Address is 0x0800, or 2,048 bytes.

You can try correcting this by hand, either while running the application in a debugger or by using your favorite binary editor on a copy of the file.
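
For illustration only, here is a minimal sketch of patching the End of File Address on a copy of the file. It assumes a version-0 superblock with 8-byte addresses, as in your dump, where the field occupies the 8 bytes at offset 40 and is stored little-endian; the program name is a placeholder.

// patch_eof.cpp - a sketch, not an official tool: rewrite the End of File
// Address (the 8 bytes at offset 40 in a version-0 superblock with 8-byte
// addresses) to the actual size of the file. Run it on a COPY of the file.
#include <cstdint>
#include <fstream>
#include <iostream>

int main(int argc, char* argv[]) {
    if (argc != 2) {
        std::cerr << "usage: patch_eof <copy_of_file.h5>\n";
        return 1;
    }
    std::fstream f(argv[1], std::ios::in | std::ios::out | std::ios::binary);
    if (!f) {
        std::cerr << "cannot open " << argv[1] << "\n";
        return 1;
    }

    // Use the actual file size as the new End of File Address.
    f.seekg(0, std::ios::end);
    std::uint64_t eof = static_cast<std::uint64_t>(f.tellg());

    // The file format stores addresses little-endian.
    unsigned char buf[8];
    for (int i = 0; i < 8; ++i)
        buf[i] = static_cast<unsigned char>((eof >> (8 * i)) & 0xff);

    f.seekp(40);  // End of File Address field in a version-0 superblock
    f.write(reinterpret_cast<const char*>(buf), 8);
    std::cout << "patched End of File Address to " << eof << " bytes\n";
    return 0;
}

(If I remember correctly, version-0 superblocks carry no checksum, so patching the field in place is enough; later superblock versions also checksum this area.)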

There could be other issues, but that’d be a start.

G.

Hi Gerd,

Thanks for the details here.

We were able to confirm that the writing application was indeed terminated with a SIGTERM signal.

I would like to know whether you have any guidance on a reasonable approach when the writing application is terminated in this manner.

  1. Should the HDF5 file be removed, and some sort of warning logged so the end user is aware of what happened?
  2. I looked up the HDF5 documentation and it seems I could call H5Fflush(H5File::getId(), H5F_SCOPE_GLOBAL) to flush the in-memory buffers to disk. Is this recommended?

Thanks for your insights.

-Kat.

Both are sensible steps to take. How effective this can be depends a lot on the specifics of the disruption. If it's not I/O-related and the HDF5 library structures (in user space!) weren't compromised, there's a good chance that flushing (and closing!) will leave things in a sane state. If it is I/O-related, e.g., disk full, a failed device, or a (temporarily) lost connection, the chances of exiting gracefully might be slim. The assumption should be that the HDF5 library has no logic for "taking evasive action." If a call fails, it fails, and the error stack will have a record of that, but any retry logic or sanity assessment of the state is on the application.
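
As an illustration only (not an official recipe), here is a minimal sketch of one way to structure this in the writer: the SIGTERM handler merely sets a flag, and the main loop flushes and closes the file from normal context, since HDF5 calls should not be made from inside a signal handler. The file name and the loop body are placeholders.

// shutdown_sketch.cpp - sketch of a SIGTERM-aware writer shutdown,
// assuming the HDF5 C++ API (H5Cpp.h). The loop body stands in for
// the application's own acquire/write logic.
#include <atomic>
#include <csignal>
#include "H5Cpp.h"

static std::atomic<bool> stop_requested{false};

extern "C" void on_sigterm(int) {
    // Only set a flag here; HDF5 calls are not async-signal-safe.
    stop_requested.store(true);
}

int main() {
    std::signal(SIGTERM, on_sigterm);

    H5::H5File file("output.h5", H5F_ACC_TRUNC);

    while (!stop_requested.load()) {
        // ... acquire data and write it via the usual dataset calls ...
    }

    // Normal (non-signal) context: flush library buffers, then close.
    H5Fflush(file.getId(), H5F_SCOPE_GLOBAL);
    file.close();
    return 0;
}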

G.

Hi Gerd,

Thanks for your insights.

Also, we were able to reproduce the abnormal termination of the writing application (which resulted in the corrupt HDF5 file). It is due to an assertion failure in the HDF5 library:

H5C.c:6732: H5C_load_entry: Assertion `entry->size < ((size_t)(32 * 1024 * 1024))' failed.

The (32 * 1024 * 1024) expression is defined as H5C_MAX_ENTRY_SIZE.

The assertion failure occurs during the H5File::createDataSet call.

It looks like a similar discussion regarding this assertion happened on this thread:

But I don’t see any conclusions on that thread.

Could you please let me know what this assertion means and how to overcome it?

Thanks
-Kat.

Can you describe the dataset you are trying to create (datatype, layout, rank, extent, etc.)? G.

Hi Gerd,

The dataset uses a compound datatype: POD structs with numeric (size_t, int, float, double), string, and boolean fields. All our datasets have rank 1. No chunking (yet).

In this case, the writing application creates ~4M groups; each group contains 2 sub-groups, and each of those contains about 20 sub-groups. The above-mentioned datasets reside at this level.

So, the structure would be something like:

/net1/group1/layer{1…20}/dataset1
/net1/group2/layer{1…20}/dataset2
.
.
.
/net4000000/group1/layer{1…20}/dataset1
/net4000000/group2/layer{1…20}/dataset2

Each dataset could contain tens or hundreds of millions of entries.
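
For illustration, a minimal sketch of one leaf of this hierarchy might look like the following (names, field counts, and extents are placeholders; the real compound has 20-30 fields, including strings):

// layout_sketch.cpp - rough sketch of how one leaf dataset is created:
// rank-1, compound type, default (contiguous) layout. Names, field
// counts, and sizes are placeholders, not the real schema.
#include <cstddef>
#include <vector>
#include "H5Cpp.h"

struct Record {            // POD struct written as one compound element
    std::size_t id;
    double      value;
    hbool_t     flag;      // boolean field, stored via NATIVE_HBOOL
};

int main() {
    H5::H5File file("sketch.h5", H5F_ACC_TRUNC);

    // /net1/group1/layer1/... (the real file has ~4M "net" groups)
    H5::Group net   = file.createGroup("/net1");
    H5::Group grp   = net.createGroup("group1");
    H5::Group layer = grp.createGroup("layer1");

    // Compound datatype mirroring the struct layout.
    H5::CompType rec_t(sizeof(Record));
    rec_t.insertMember("id",    HOFFSET(Record, id),    H5::PredType::NATIVE_HSIZE);
    rec_t.insertMember("value", HOFFSET(Record, value), H5::PredType::NATIVE_DOUBLE);
    rec_t.insertMember("flag",  HOFFSET(Record, flag),  H5::PredType::NATIVE_HBOOL);

    // Rank-1 dataspace; the real datasets have far more entries.
    hsize_t dims[1] = {1000};
    H5::DataSpace space(1, dims);

    H5::DataSet dset = layer.createDataSet("dataset1", rec_t, space);

    // All data for the dataset is acquired up front and written in one call.
    std::vector<Record> records(dims[0]);
    dset.write(records.data(), rec_t);
    return 0;
}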

Thanks
-Kat.

Can you reproduce the error at will? Can you provide us with a reproducer? Nothing you are describing sounds unusual. The comment on the definition is admittedly a little equivocal:

/* This sanity-checking constant was picked out of the air.  Increase
 * or decrease it if appropriate.  Its purpose is to detect corrupt
 * object sizes, so it probably doesn't matter if it is a bit big.
 */
#define H5C_MAX_ENTRY_SIZE ((size_t)(32 * 1024 * 1024))

It suggests that we don't expect cache entries to be that big (32 MiB), and nothing in your description suggests anything near that size. My hunch is that this has nothing to do with the H5File::createDataSet call itself, but that some corruption ("detect corrupt object sizes") is occurring in your application or somewhere in the library.

G.

For the given case, I'm able to reproduce the assertion failure in the HDF5 library consistently. I'm not sure I'll have the bandwidth to create a standalone reproducer, but I will try to do so in the next week or so.

I know that the writer application has no Valgrind issues such as invalid read/write errors. Out of curiosity, I re-ran Valgrind and noticed this:

===
==400120== Syscall param pwrite64(buf) points to uninitialised byte(s)
==400120== at 0x12799FC3: ??? (in /usr/lib64/libpthread-2.17.so)
==400120== by 0x432FAB7: H5FD_sec2_write (H5FDsec2.c:816)
==400120== by 0x43273C8: H5FD_write (H5FDint.c:248)
==400120== by 0x460D996: H5F__accum_write (H5Faccum.c:826)
==400120== by 0x4465781: H5PB_write (H5PB.c:1031)
==400120== by 0x4304040: H5F_block_write (H5Fio.c:251)
==400120== by 0x426A9BA: H5C__flush_single_entry (H5C.c:6109)
==400120== by 0x4272611: H5C__make_space_in_cache (H5C.c:6961)
==400120== by 0x42735A7: H5C_insert_entry (H5C.c:1458)
==400120== by 0x423B279: H5AC_insert_entry (H5AC.c:810)
==400120== by 0x43ED434: H5O__apply_ohdr (H5Oint.c:548)
==400120== by 0x43F40DA: H5O_create (H5Oint.c:316)
==400120== by 0x42A6D53: H5D__update_oh_info (H5Dint.c:1030)
==400120== by 0x42A9C64: H5D__create (H5Dint.c:1373)
==400120== by 0x46071A5: H5O__dset_create (H5Doh.c:300)
==400120== by 0x43F1FB9: H5O_obj_create (H5Oint.c:2521)
==400120== by 0x43AB717: H5L__link_cb (H5L.c:1850)
==400120== by 0x43651E9: H5G__traverse_real (H5Gtraverse.c:629)
==400120== by 0x4365F80: H5G_traverse (H5Gtraverse.c:854)
==400120== by 0x43A37ED: H5L__create_real (H5L.c:2044)
==400120== by 0x43AD96E: H5L_link_object (H5L.c:1803)
==400120== by 0x42A8E28: H5D__create_named (H5Dint.c:410)
==400120== by 0x45A9051: H5VL__native_dataset_create (H5VLnative_dataset.c:74)
==400120== by 0x458409F: H5VL__dataset_create (H5VLcallback.c:1834)
==400120== by 0x458E19C: H5VL_dataset_create (H5VLcallback.c:1868)
==400120== by 0x42991AC: H5Dcreate2 (H5D.c:150)
==400120== by 0x41FBB9C: H5::H5Location::createDataSet(char const*, H5::DataType const&, H5::DataSpace const&, H5::DSetCreatPropList const&, H5::DSetAccPropList const&, H5::LinkCreatPropList const&) const (H5Location.cpp:932)
==400120== by 0x41FBD78: H5::H5Location::createDataSet(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, H5::DataType const&, H5::DataSpace const&, H5::DSetCreatPropList const&, H5::DSetAccPropList const&, H5::LinkCreatPropList const&) const (H5Location.cpp:958)

// Rest of writer application stack

===

Is this something that should be addressed? If so, could you suggest how? Valgrind reports only one occurrence of this issue.

Thanks
-Kat

(The experts will correct me…) I think this is nothing to lose sleep over. When a new object (e.g., a dataset) is created, it's linked into the group structure and an object header is created. Furthermore, the metadata cache is updated to have things on hand when needed. If you dig into the code, you'll see that various structures with array fields may get only partially initialized (the arrays). I think that's what Valgrind is calling out here.

G.

(OK, we don’t have the state of the metadata cache in your application…)
We could try to reproduce the error by creating just the dataset you’re dealing with. What’s the type and shape of that dataset, and what are the creation properties?

G.

Hi Gerd,

The dataset for which the Valgrind error occurs is as I mentioned above:

  • Compound datatype with int, float, double, and boolean values for the various fields.
  • Single dimension, using the default dataset creation properties (no chunking and hence no compression).

I have created a writer program that shows the Valgrind issue. Please let me know how I can send the tarball to you.

Thanks
-Kat

I have another update regarding the H5C_MAX_ENTRY_SIZE definition. I experimented with changing it to:

#define H5C_MAX_ENTRY_SIZE ((size_t)(64 * 1024 * 1024))

I rebuilt the HDF5 library with this definition and re-ran the writer application. This time, a valid HDF5 file was generated and could be read back in. However, generating the ~347 GB file took about 22 hours, which seems quite excessive.

Is this change to H5C_MAX_ENTRY_SIZE something that you would recommend incorporating, at least temporarily? A customer of ours is evaluating our flow, and we would like them to be able to move forward with the HDF5 write/read.

As I mentioned in the previous post, I have a stripped-down version of the writer code that I can share. Please let me know where to send it.

Thanks
-Kat

Hi Gerd,

Could you please share your thoughts on the above 2 posts?

Thanks
-Kat

Can you try a newer library version? HDF5 1.12.0 is relatively old and will reach end of life soon. We fixed several CVE issues that involved the writing of uninitialized data. I'd recommend 1.14.1 (release imminent) or 1.10.10. If you have to stick with 1.12, use 1.12.2.

G.

Nothing unusual here. How many fields does your compound have? How long are the (ASCII-only) field names?

G.

Yes. (Excessive…) How are you acquiring/writing the data? Element by element?

No. There's something fishy here; we'd better figure out what's happening.

I’ve shared a link.

G.

The compound type typically has about 20-30 fields. Field names are fairly short, about 10-20 characters.

Thanks
-Kat

All data for a dataset is acquired prior to writing it.

I have uploaded the sample writer program to the link you provided.

Thanks
-Kat.