Hello!
I have a problem with an HDF5 file created with h5py. I suspect that the process writing to the file was killed.
Problem
I have an HDF5 file created with h5py. Unfortunately, I cannot read the data in it because I get several errors when trying to access the datasets by their paths:
- Unable to open object (address of object past end of allocation)
- Unable to get group info (addr overflow, addr = 44103764, size = 544, eoa = 2048)
Additional Info
I suspect the process writing to the file was killed, so the file was not closed properly.
System: Ubuntu 20.04
hdf5-utils info:
$ h5dump file.hd5
h5dump error: internal error (file …/…/…/…/…/tools/src/h5dump/h5dump.c:line 1493)
$ h5debug file.hd5
Reading signature at address 0 (rel)
File Super Block…
File name (as opened): file.hd5
File name (after resolving symlinks): file.hd5
File access flags 0x00000000
File open reference count: 1
Address of super block: 0 (abs)
Size of userblock: 0 bytes
Superblock version number: 0
Free list version number: 0
Root group symbol table entry version number: 0
Shared header version number: 0
Size of file offsets (haddr_t type): 8 bytes
Size of file lengths (hsize_t type): 8 bytes
Symbol table leaf node 1/2 rank: 4
Symbol table internal node 1/2 rank: 16
Indexed storage internal node 1/2 rank: 32
File status flags: 0x01
Superblock extension address: UNDEF (rel)
Shared object header message table address: UNDEF (rel)
Shared object header message version number: 0
Number of shared object header message indexes: 0
Address of driver information block: UNDEF (rel)
Root group symbol table entry:
Name offset into private heap: 0
Object header address: 96
Cache info type: Symbol Table
Cached entry information:
B-tree address: 136
Heap address: 680
Attempt
So I extended the EOA (end-of-allocation address) with: h5clear --increment file.hd5
Now I get only the following error:
Link iteration failed (bad local heap signature)
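To see what is actually stored at the addresses h5debug reports, a small raw-byte check might help. The following is only a minimal sketch: it assumes the base address is 0 (the userblock size above is 0), and it relies on the block signatures the HDF5 file format uses for superblock version 0 files (`HEAP` for local heaps, `TREE` for B-tree nodes, `SNOD` for symbol table nodes, `GCOL` for global heap collections). The function names are my own and not part of h5py or the HDF5 tools.

```python
# Block signatures used by the HDF5 file format in superblock version 0 files.
SIGNATURES = (b"HEAP", b"TREE", b"SNOD", b"GCOL")


def signature_at(path, address):
    """Return the 4 bytes stored at `address` (shorter or empty past end of file)."""
    with open(path, "rb") as f:
        f.seek(address)
        return f.read(4)


def find_signatures(path, signatures=SIGNATURES):
    """Return a sorted list of (offset, signature) for every occurrence in the file."""
    with open(path, "rb") as f:
        data = f.read()  # acceptable for a sketch; use mmap for very large files
    hits = []
    for sig in signatures:
        pos = data.find(sig)
        while pos != -1:
            hits.append((pos, sig))
            pos = data.find(sig, pos + 1)
    return sorted(hits)


if __name__ == "__main__":
    import sys

    path = sys.argv[1]  # e.g. file.hd5
    # Addresses taken from the h5debug output above.
    print("bytes at B-tree address 136:", signature_at(path, 136))
    print("bytes at heap address 680:  ", signature_at(path, 680))
    for offset, sig in find_signatures(path):
        print(offset, sig.decode("ascii"))
```

If the bytes at 680 are not `HEAP`, that would match the "bad local heap signature" error. Hits from the scan are only candidates, since the same four bytes can also occur by chance inside dataset contents.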
Reconstructing
Is it possible to reconstruct the data in the file?
I know exactly what the individual datasets are (size and type).
Is it possible to address the datasets in some way other than by name, for example by their byte offset relative to their parent?
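To illustrate what I mean by reading via an offset, here is a minimal sketch. It assumes a dataset is stored contiguously and without filters (no chunking, no compression), and that its byte offset in the file is already known somehow. The function name, the offset, and the format string are hypothetical placeholders, not values taken from my file.

```python
import struct


def read_raw_values(path, offset, count, fmt="<d"):
    """Read `count` fixed-size values starting at byte `offset`.

    `fmt` is a struct format for ONE element, e.g. "<d" for a
    little-endian float64 or "<i" for a little-endian int32.
    """
    item = struct.Struct(fmt)
    nbytes = item.size * count
    with open(path, "rb") as f:
        f.seek(offset)
        raw = f.read(nbytes)
    if len(raw) < nbytes:
        raise ValueError("file ends before the requested data")
    # Unpack one element at a time from the raw buffer.
    return [item.unpack_from(raw, i * item.size)[0] for i in range(count)]
```

For example, `read_raw_values("file.hd5", offset, 1000, "<d")` would return 1000 float64 values if `offset` really points at the start of such a dataset. For chunked or compressed datasets this approach would not work as written.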
I would greatly appreciate any help. Thank you!