Hi,
I posted this question on the h5py forum initially, but it was suggested to me
that this would be a more appropriate venue. I have included the original question,
traceback, and version numbers. If anything else is needed, please let me know and
I will be happy to provide it.
I am running a program on a cluster that uses h5py to read from a file.
A region reference (regionref) is used to locate a dataset in the file and then read the region of interest.
It is worth noting that many jobs (~60 processes) on the cluster may be reading
from the same file at the same time; however, no writing occurs while these jobs are reading.
It is also worth noting that the dataset is quite large (~3000x500x500 of uint16),
but the region pointed to by each regionref is not (~100x100x100 of uint16).
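For reference, the read is performed roughly like this (a minimal sketch; the file path
and the name of the dataset holding the region references are hypothetical stand-ins
for my actual code, which appears in the traceback below):

    import h5py

    # Each job opens the shared file read-only; ~60 processes may do this
    # concurrently, and nothing writes to the file while they read.
    with h5py.File("/path/to/shared_data.h5", "r") as fid:
        # "refs" is a hypothetical dataset of h5py RegionReference objects.
        data_ref = fid["refs"][0]
        # Indexing the file with the regionref dereferences it to the large
        # uint16 dataset; indexing that dataset with the same regionref then
        # reads only the ~100x100x100 region of interest.
        data = fid[data_ref][data_ref]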
Normally, this works fine. However, I am intermittently getting an exception.
The relevant portion of the traceback appears below.
I am no expert on HDF5, but I took a brief glance at the C code.
It appears that the signature check on a global heap collection is failing, i.e. the
bytes read do not match the magic number HDF5 expects, as if the data in memory or
on disk had been corrupted, and this is what raises the exception.
If anybody could help, I would be very grateful.
Best,
John
P.S. - I should add that if I rerun any of the failing cases without changing anything, they will complete.
Traceback:
File "/groups/flyem/home/kirkhamj/nanshe-package/src/nanshe-git/nanshe/HDF5_serializers.py", line 112, in read_numpy_structured_array_from_HDF5
data = external_file_handle[data_ref][data_ref]
File "/groups/flyem/home/kirkhamj/nanshe-package/lib/python2.7/site-packages/h5py-2.3.1-py2.7-linux-x86_64.egg/h5py/_hl/group.py", line 149, in __getitem__
File "h5r.pyx", line 75, in h5py.h5r.dereference (h5py/h5r.c:1289)
ValueError: Unable to dereference object (Bad global heap collection signature)
OS: Scientific Linux 6.3
Kernel Version:
[kirkhamj@h03u24 ~]$ uname -r
2.6.32-279.el6.x86_64
h5py and HDF5 Versions:
In [1]: import h5py
In [2]: h5py.__version__
Out[2]: '2.3.1'
In [3]: h5py.version.hdf5_version
Out[3]: '1.8.13'