Unable to open a Dataset in my HDF5 file

Hello,

I am using Google Colaboratory to open an .h5 file with the h5py library. The attached screenshot contains the commands I used to get the details of the file. When I try to convert a dataset in the HDF5 file to a NumPy array with the command “phrase_numpy = np.array(phrase)”, where “phrase” is the dataset, I get an OSError as attached. Could someone help me resolve this issue?
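For reference, this is roughly the sequence I ran (a minimal sketch; the file name “data.h5” is a placeholder for my actual file):

```python
import h5py
import numpy as np

# "data.h5" stands in for the actual file; the dataset is named "phrase".
f = h5py.File("data.h5", "r")
print(list(f.keys()))              # list the top-level objects
phrase = f["phrase"]
print(phrase.shape, phrase.dtype)  # details of the dataset

phrase_numpy = np.array(phrase)    # this is the call that raises the OSError
```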

Best regards,
Sai

(screenshot: h5file_details)

Halfway through the traceback, the error message reads:

OSError: Can't read data (can't open directory : /usr/local/hdf5/lib/plugin)

Perhaps your dataset phrase depends on some kind of filter (compression?), and /usr/local/hdf5/lib/plugin is the default location where such filter plugins are looked up. What does

print(phrase.compression)

return?
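(If you want to check the whole filter pipeline at once, these dataset properties cover the usual suspects; this is generic h5py, nothing specific to your file:)

```python
# All of these are standard h5py Dataset properties; anything other than
# None/False points at a filter the library would need at read time.
print(phrase.compression)       # e.g. "gzip", "lzf", or a registered plugin
print(phrase.compression_opts)  # options passed to the compression filter
print(phrase.shuffle)           # True if the shuffle filter is enabled
print(phrase.fletcher32)        # True if checksum filtering is enabled
print(phrase.scaleoffset)       # scale-offset filter setting, if any
```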

G.

It’s returning None.

(screenshot: phrase_compression)

OK, that means the dataset is not compressed. I’m not sure why that would trigger the exception (why look for plugins we don’t need?). It could be an error in the way the environment (Google Colaboratory) is set up. How big is your file? Can you try running your notebook in HDF Lab?
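If it is the environment, one thing worth checking is where HDF5 is looking for plugins; a hedged sketch (the directory below is just the one from your error message, and creating it is only a guess at a workaround):

```python
import os

# HDF5 consults HDF5_PLUGIN_PATH before its built-in default directory.
print(os.environ.get("HDF5_PLUGIN_PATH"))

# Does the default plugin directory from the error message even exist?
print(os.path.isdir("/usr/local/hdf5/lib/plugin"))

# Guess at a workaround: create the missing directory so the library has
# something to open. Untested in Colab.
os.makedirs("/usr/local/hdf5/lib/plugin", exist_ok=True)
```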

There are other exceptions farther down:

AttributeError: `OSError` object has no attribute `_render_traceback_`

and

OSError: [Errno 107] Transport endpoint is not connected

I’m not sure what that is about. These errors have nothing to do with HDF5 or h5py.

I think trying HDF Lab is your best bet; there, at least, we can help you with issues you might run into.

Best, G.

The correct way to get an HDF5 dataset’s data as a NumPy array is: phrase_numpy = phrase[...].
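In context, that would look something like this (the file name is a placeholder):

```python
import h5py

# "data.h5" stands in for the actual file path.
with h5py.File("data.h5", "r") as f:
    phrase = f["phrase"]
    # Ellipsis indexing reads the whole dataset into a NumPy array,
    # so the data remains usable after the file is closed.
    phrase_numpy = phrase[...]
```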

Aleksandar