Hello,
I create an HDF5 file in parallel using h5py. The script basically generates time series, each consisting of 2048 floats. If I generate a small file (~80 MB), everything works fine. If I create a large file (~8 GB), I can't open it.
I have other scripts that also use h5py (but not in parallel) and generate files of ~80 GB, and I can open those, so it's not as if h5dump has an issue with big files per se.
If I run

    h5dump -H foo.hdf

I get

    h5dump error: unable to open file foo.hdf

which really doesn't help me. How do I figure out why the file can't be opened?
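One thing that gives more information than h5dump's generic error is checking whether the file even contains a valid HDF5 superblock signature. This is a stdlib-only sketch (the signature bytes and the 0/512/1024/... offsets come from the HDF5 file format specification; `find_hdf5_signature` is my own helper name, not an h5py API):

```python
import os

# The 8-byte HDF5 format signature, per the HDF5 file format specification.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"

def find_hdf5_signature(path):
    """Return the byte offset of the HDF5 signature, or None if absent.

    The superblock sits at offset 0 for ordinary files, or at
    512, 1024, 2048, ... when the file has a user block.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        offset = 0
        while offset < size:
            f.seek(offset)
            if f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return offset
            offset = 512 if offset == 0 else offset * 2
    return None
```

If this returns None, the superblock never made it to disk (e.g. the file was not closed cleanly by every rank); `h5py.is_hdf5(path)` performs essentially the same check. If the signature is there but h5dump still fails, the damage is deeper inside the file and the `h5debug` tool that ships with HDF5 can inspect individual objects.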
Also, here's a little bit about my code. I basically have a master/slave approach: my master rank generates some data that's passed to a slave, which then uses that data to generate a bunch of time series.
Each slave opens the file using:
    with FileManager(filename, slaves, N_noise, N_signal, N_samples) as file:
        generator = TimeSeriesGenerator(file)
        ...
where FileManager opens the file in its __init__ method and leaves closing it to the with statement above.
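Roughly, the pattern looks like this (a simplified sketch, not the exact class: the extra arguments are omitted, and the serial fallback only exists so the sketch runs without MPI; the parallel branch is the mpio-driver open that h5py provides):

```python
import h5py

class FileManager:
    """Context manager: open the file in __init__, close it in __exit__."""

    def __init__(self, filename, comm=None):
        if comm is not None:
            # Parallel case: every rank in `comm` opens the file collectively
            # via h5py's MPI-IO driver (requires h5py built with parallel HDF5).
            self._file = h5py.File(filename, "a", driver="mpio", comm=comm)
        else:
            # Serial fallback so the sketch is runnable without MPI.
            self._file = h5py.File(filename, "a")

    def __enter__(self):
        return self._file

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Closing here is what flushes the superblock and metadata to disk;
        # if any rank skips this, the file can end up unreadable.
        self._file.close()
        return False
```

The point of the pattern is that every rank is guaranteed to close the file, even if an exception is raised inside the with block.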
I now tried it with HDFView (over ssh -X, if that matters) and I get

    java.io.IOException: Unsupported fileformat
Edit: Note that I create a communicator for the slaves, and all of them open the file. The master, which only lives in the world communicator, never opens or interacts with the file. I also can't see how scaling up the amount of data generated should cause an issue with the file itself; it does the same thing, just more of it. In any case, I would like to get more information about what's wrong with the file.
Edit: It seems I simply got unlucky and something went wrong during the last generation run. I redid it and the file seems to be fine now. I can't really explain it, since I never saw any error, but I guess I just got unlucky?