Error on H5 reading: heap out of range

Hi all, I have run into an error like this many times. The procedure is always the same, but from time to time this happens and I just throw the file away. This file is an important one, so please help me.

Essentially, the error in Python is:
RuntimeError: Unable to get group info (addr overflow, addr = 8240700081, size = 328, eoa = 8240590217)

If I run the command-line tool h5ls, it doesn’t show anything.
I also ran h5debug filename, which printed some information such as the B-tree address and the heap address, so I ran it again as
h5debug filename 136 680
(136 and 680 being the B-tree and heap addresses, respectively), and it gives:
Reading signature at address 680 (rel)
An error occurred!
HDF5-DIAG: Error detected in HDF5 (1.10.7) thread 1:
#000: …/…/…/src/H5HLdbg.c line 66 in H5HL_debug(): unable to load/protect local heap
major: Heap
minor: Unable to protect metadata
#001: …/…/…/src/H5HL.c line 364 in H5HL_protect(): unable to load heap data block
major: Heap
minor: Unable to protect metadata
#002: …/…/…/src/H5AC.c line 1517 in H5AC_protect(): H5C_protect() failed
major: Object cache
minor: Unable to protect metadata
#003: …/…/…/src/H5C.c line 2501 in H5C_protect(): can’t load entry
major: Object cache
minor: Unable to load metadata into cache
#004: …/…/…/src/H5C.c line 7661 in H5C_load_entry(): Can’t deserialize image
major: Object cache
minor: Unable to load metadata into cache
#005: …/…/…/src/H5HLcache.c line 764 in H5HL__cache_datablock_deserialize(): can’t initialize free list
major: Heap
minor: Unable to initialize object
#006: …/…/…/src/H5HLcache.c line 260 in H5HL__fl_deserialize(): bad heap free list
major: Heap
minor: Out of range

What does it mean?

I also ran the command
strings -t d file.h5
and the output seems fine: I can see that the data are there (I can’t upload the strings output because it is 2.9 GB).
Online I keep reading about h5check, but there are two major problems: I can’t install it (I don’t know why), and it is said to work only on files smaller than 2 GB, while mine is around 10 GB.

If you need any more information, tell me. Help me please :wink:

The error in the code is:
HDF5-DIAG: Error detected in HDF5 (1.8.19) MPI-process 0:
#000: H5Ddeprec.c line 191 in H5Dcreate1(): unable to create dataset
major: Dataset
minor: Unable to initialize object
#001: H5Dint.c line 453 in H5D__create_named(): unable to create and link to dataset
major: Dataset
minor: Unable to initialize object
#002: H5L.c line 1636 in H5L_link_object(): unable to create new link to object
major: Links
minor: Unable to initialize object
#003: H5L.c line 1880 in H5L_create_real(): can’t insert link
major: Symbol table
minor: Unable to insert object
#004: H5Gtraverse.c line 859 in H5G_traverse(): internal path traversal failed
major: Symbol table
minor: Object not found
#005: H5Gtraverse.c line 594 in H5G_traverse_real(): can’t look up component
major: Symbol table
minor: Object not found
#006: H5Gobj.c line 1154 in H5G__obj_lookup(): can’t locate object
major: Symbol table
minor: Object not found
#007: H5Gstab.c line 905 in H5G__stab_lookup(): not found
major: Symbol table
minor: Object not found
#008: H5B.c line 360 in H5B_find(): can’t lookup key in subtree
major: B-Tree node
minor: Object not found
#009: H5B.c line 360 in H5B_find(): can’t lookup key in subtree
major: B-Tree node
minor: Object not found
#010: H5B.c line 360 in H5B_find(): can’t lookup key in subtree
major: B-Tree node
minor: Object not found
#011: H5B.c line 338 in H5B_find(): unable to load B-tree node
major: B-Tree node
minor: Unable to protect metadata
#012: H5AC.c line 1260 in H5AC_protect(): H5C_protect() failed.
major: Object cache
minor: Unable to protect metadata
#013: H5C.c line 3572 in H5C_protect(): can’t load entry
major: Object cache
minor: Unable to load metadata into cache
#014: H5C.c line 7952 in H5C_load_entry(): unable to load entry
major: Object cache
minor: Unable to load metadata into cache
#015: H5Bcache.c line 141 in H5B__load(): wrong B-tree signature
major: B-Tree node
minor: Unable to load metadata into cache

RuntimeError: Unable to get group info (addr overflow, addr = 8240700081, size = 328, eoa = 8240590217)

Since addr > eoa, you are trying to read past the end-of-allocation. How did you create the file in the first place? A common cause is premature application termination, where the library doesn’t get a chance to flush updates from memory to the file.
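To put a number on that comparison, the check can be sketched in a few lines of Python. This is only an illustration: the helper name overflow_bytes is made up here and is not part of h5py or HDF5. It parses the three numbers out of the message quoted above and reports how far the requested block reaches past eoa.

```python
import re

# Matches the "addr overflow" detail that appears in the error text.
_OVERFLOW_RE = re.compile(
    r"addr overflow, addr = (\d+), size = (\d+), eoa = (\d+)"
)


def overflow_bytes(message):
    """Return how many bytes the failed read extends past eoa.

    Hypothetical helper: returns None when the message carries no
    'addr overflow' detail, and 0 when the block fits below eoa.
    """
    match = _OVERFLOW_RE.search(message)
    if match is None:
        return None
    addr, size, eoa = (int(group) for group in match.groups())
    # The read covers [addr, addr + size); anything beyond eoa is missing.
    return max(0, addr + size - eoa)


msg = ("RuntimeError: Unable to get group info "
       "(addr overflow, addr = 8240700081, size = 328, eoa = 8240590217)")
print(overflow_bytes(msg))  # 110192
```

For the message above this gives 110192, so the library was asked for a block that ends about 108 KiB past the end of what the file has allocated.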

G.

How does this relate to the previous post?

G.

The second post is the first part of the error message I received in the output of the code that was building the file, so that is where the whole chain of errors started. The first post shows the problem I get when reading the file.

To create the file I use a community code called Einstein Toolkit, so I don’t know the details, but if you need them I can look them up.
It appears that I ran out of available storage while running the simulation, so the code could no longer append data to the file, but it kept opening the file, trying to write, and closing it. Could some allocations be corrupted? If so, how can I discard them?

Yes, that’s another good one. Remember that the state of an open (RW) file is a hybrid of in-memory and in-storage bytes. At this point (out of space), the in-memory part is getting out of sync with the in-storage part in a way that is, for the moment, beyond repair. You’d be asking the library to keep track of the change history and then decide for you which changes to jettison or roll back when matters get out of hand. I’m not saying this is an impossible task, but it is not currently supported.
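Since the library cannot undo this after the fact, one defensive option is to check for free space before each write and stop cleanly while the file is still consistent. The following is a minimal sketch using only the Python standard library; the function name has_room and the reserve parameter are hypothetical, and it says nothing about how Einstein Toolkit does its I/O. Note that another process can still use up the space between the check and the write, so this narrows the window rather than closing it.

```python
import shutil


def has_room(directory, needed_bytes, reserve_bytes=0):
    """Return True if the filesystem holding `directory` can take
    `needed_bytes` and still keep `reserve_bytes` free afterwards.

    Hypothetical helper; the caller decides what to do on False
    (for example, flush and close the output file, then exit).
    """
    free = shutil.disk_usage(directory).free
    return free - reserve_bytes >= needed_bytes
```

A driver script could call has_room(output_dir, estimated_step_size) before every output step and shut down in an orderly way as soon as it returns False, instead of continuing to open and close a file it can no longer extend.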

Yes, some allocations could be corrupted.

You would discard them by doing the HDF5 counterpart of what fsck does after an unexpected shutdown (without the benefit of a transaction log). Unfortunately, I can’t point you to a tool that would do that for you.
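Short of such a tool, a first look at what survived can be taken by hand, in the spirit of the strings -t d run mentioned earlier in the thread. The sketch below is an illustration under stated assumptions, not a repair procedure: the function name scan_signatures and the chunk size are made up here, and it only reports byte offsets where metadata signatures defined in the HDF5 file format specification occur. A hit is a candidate only, since the same bytes can also appear inside raw data.

```python
# Signatures defined by the HDF5 file format specification.
SIGNATURES = {
    b"\x89HDF\r\n\x1a\n": "superblock",
    b"TREE": "v1 B-tree node",
    b"HEAP": "local heap",
    b"SNOD": "symbol table node",
}


def scan_signatures(path, chunk_size=1 << 20):
    """Yield (offset, label) for each signature match in the file.

    Hypothetical helper. The file is read in chunks so that large
    files do not have to fit in memory; hits are yielded chunk by
    chunk, so they come out roughly (not strictly) in file order.
    """
    overlap = max(len(sig) for sig in SIGNATURES) - 1
    with open(path, "rb") as handle:
        tail = b""        # last bytes of the previous buffer
        chunk_start = 0   # file offset of the current chunk
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            buf = tail + chunk
            base = chunk_start - len(tail)  # file offset of buf[0]
            hits = []
            for sig, label in SIGNATURES.items():
                pos = buf.find(sig)
                while pos != -1:
                    # Matches lying fully inside the tail were already
                    # reported together with the previous chunk.
                    if pos + len(sig) > len(tail):
                        hits.append((base + pos, label))
                    pos = buf.find(sig, pos + 1)
            yield from sorted(hits)
            tail = buf[-overlap:]
            chunk_start += len(chunk)
```

Looping over scan_signatures("file.h5") and printing each (offset, label) pair lists candidate addresses that can then be passed to h5debug, as was done with 136 and 680 above. On a file of around 10 GB a pure-Python scan like this will take a while.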

G.

Thanks for the help. I tried a bit more, but I believe it’s better to leave it for now.

Actually, I found another problem with a different file. The code that generates the file reports no error and stops normally when it reaches the time limit, but when I try to read the file the error is similar.
I tried various commands from the tools, and these are the results:

h5dump file:
h5dump error: internal error (file …/…/…/…/…/tools/src/h5dump/h5dump.c:line 1487)

h5debug file 136 680 (the B-tree and heap addresses, respectively):
Reading signature at address 136 (rel)
An error occurred!
HDF5-DIAG: Error detected in HDF5 (1.10.7) thread 1:
#000: …/…/…/src/H5Gnode.c line 1491 in H5G_node_debug(): unable to protect symbol table heap
major: Symbol table
minor: Unable to load metadata into cache
#001: …/…/…/src/H5HL.c line 364 in H5HL_protect(): unable to load heap data block
major: Heap
minor: Unable to protect metadata
#002: …/…/…/src/H5AC.c line 1517 in H5AC_protect(): H5C_protect() failed
major: Object cache
minor: Unable to protect metadata
#003: …/…/…/src/H5C.c line 2501 in H5C_protect(): can’t load entry
major: Object cache
minor: Unable to load metadata into cache
#004: …/…/…/src/H5C.c line 7661 in H5C_load_entry(): Can’t deserialize image
major: Object cache
minor: Unable to load metadata into cache
#005: …/…/…/src/H5HLcache.c line 764 in H5HL__cache_datablock_deserialize(): can’t initialize free list
major: Heap
minor: Unable to initialize object
#006: …/…/…/src/H5HLcache.c line 260 in H5HL__fl_deserialize(): bad heap free list
major: Heap
minor: Out of range

h5check file:

VALIDATING illinoisgrmhd-grmhd_primitives_allbutbi.xy.h5 according to library version 1.8.0

***Error***
Local Heap:Bad heap free list at addr 704
***End of Error messages***
Non-compliance errors found

h5stat file:
h5stat error: unable to traverse objects/links in file “illinoisgrmhd-grmhd_primitives_allbutbi.xy.h5”