H5Iget_file_id interferes with collective write?


#1

Hi all,

I’ve been working on implementing instrumentation wrappers for HDF5 (the standard wrapper-library technique) and have encountered some strange behaviour with the ph5example.c sample. Our wrappers record trace records at function entry and exit, and if the wrapped function performs an I/O operation, they record that as well; this means we call H5Iget_file_id inside the wrappers to find the file handle that goes with an H5Dread/H5Dwrite operation. Here’s the behaviour I’ve seen:

  • If H5Iget_file_id is called in the wrapper for H5Dwrite, and a single run of ph5example performs both a collective write and a collective read, then the file has an invalid superblock when it is reopened for the collective read.
  • If we interpose an H5close/H5open pair between the collective write and the collective read, everything works fine.
  • Non-collective reads and writes do not have this problem.
  • Replacing H5Iget_file_id with a simple cache mapping dataset IDs to file IDs also works, though this is obviously not a great solution.

The calls to H5Iget_file_id should be safe according to the documentation: they are, by construction, exactly as collective as the H5Dwrite calls they wrap. Is this a bug, or am I doing something that’s unsafe by design somehow?
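For context, the wrapper technique looks roughly like the sketch below: a link-time interposer (GNU ld’s --wrap) around H5Dwrite that calls H5Iget_file_id on entry. HDF5’s types and functions are stubbed here so the sketch compiles stand-alone; the stub bodies and handle values are illustrative only, not our actual tool.

```c
/* Sketch of the instrumentation wrapper, as built with
 *   gcc -Wl,--wrap=H5Dwrite ...
 * HDF5 is stubbed out below so this compiles stand-alone; in the
 * real tool these symbols come from the actual library. */
#include <stdio.h>

typedef long long hid_t;  /* stand-in for HDF5's handle type */
typedef int herr_t;

/* ---- stubs standing in for the real HDF5 library ---- */
hid_t H5Iget_file_id(hid_t dset_id) { return dset_id + 1000; }
herr_t H5Fclose(hid_t file_id) { (void)file_id; return 0; }
herr_t __real_H5Dwrite(hid_t dset, hid_t mem_type, hid_t mem_space,
                       hid_t file_space, hid_t xfer, const void *buf)
{ (void)dset; (void)mem_type; (void)mem_space;
  (void)file_space; (void)xfer; (void)buf; return 0; }

/* ---- the wrapper: trace records at entry/exit, file-id lookup ---- */
herr_t __wrap_H5Dwrite(hid_t dset, hid_t mem_type, hid_t mem_space,
                       hid_t file_space, hid_t xfer, const void *buf)
{
    hid_t fid = H5Iget_file_id(dset);       /* the call under suspicion */
    fprintf(stderr, "enter H5Dwrite: dataset=%lld file=%lld\n",
            (long long)dset, (long long)fid);
    herr_t ret = __real_H5Dwrite(dset, mem_type, mem_space,
                                 file_space, xfer, buf);
    H5Fclose(fid);                          /* drop the extra reference */
    fprintf(stderr, "exit H5Dwrite: ret=%d\n", ret);
    return ret;
}
```

Every rank reaches the wrapper together, so the H5Iget_file_id call is exactly as collective as the H5Dwrite it surrounds, which is what makes the failure surprising.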

–Bill


#2

For a simple reproducer, you can add a call to H5Iget_file_id(dataset1) immediately before the H5Dwrite call on line 590 of ph5example.c as follows:

    /* write data collectively */
    (void)H5Iget_file_id(dataset1);
    ret = H5Dwrite(dataset1, H5T_NATIVE_INT, mem_dataspace, file_dataspace,
                   xfer_plist, data_array1);
    assert(ret != FAIL);
    MESG("H5Dwrite succeed");

This fails for me with the following output:

HDF5-DIAG: Error detected in HDF5 (1.11.4) MPI-process 0:
  #000: /home/wwilliam/hdf5/src/H5F.c line 752 in H5Fopen(): unable to open file
    major: File accessibility
    minor: Unable to open file
  #001: /home/wwilliam/hdf5/src/H5VLcallback.c line 2765 in H5VL_file_open(): open failed
    major: Virtual Object Layer
    minor: Can't open object
  #002: /home/wwilliam/hdf5/src/H5VLcallback.c line 2730 in H5VL__file_open(): open failed
    major: Virtual Object Layer
    minor: Can't open object
  #003: /home/wwilliam/hdf5/src/H5VLnative_file.c line 100 in H5VL__native_file_open(): unable to open file
    major: File accessibility
    minor: Unable to open file
  #004: /home/wwilliam/hdf5/src/H5Fint.c line 1701 in H5F_open(): unable to read superblock
    major: File accessibility
    minor: Read failed
  #005: /home/wwilliam/hdf5/src/H5Fsuper.c line 411 in H5F__super_read(): file signature not found
    major: File accessibility
    minor: Not an HDF5 file
ph5example: /home/h4/wwilliam/scorep/TRY_TUD_io_recording/test/io/hdf5/ph5example.c:722: phdf5readAll: Assertion `fid1 != -1' failed.


#3

@dax.rodriguez Following up from our conversation at ISC19.
It is not immediately clear to me whether this problem is identical to the one my colleague @harryherold reported here: PHDF5: Inconsistent HDF5 file. The end result, an HDF5-written file being reported as inconsistent when reopened, is the same, and I believe what I have is the simplest reproducer of that problem, but there may be other factors at work in his reproducer.

If anyone is aware of a workaround that would allow us to call H5Iget_file_id safely at all times on the 1.8/1.10 versions, or can define what assumptions are safe to make about caching file IDs in those versions, that would also be helpful: as tool developers, we don’t get to control our users’ use of the HDF5 API.
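For reference, the caching workaround from post #1 can be sketched as follows: record the dataset-id → file-id mapping once when the dataset is opened or created (outside any collective operation), and look it up in the read/write wrappers so H5Iget_file_id never runs inside the collective write path. hid_t is stubbed as a plain integer so the sketch compiles stand-alone; the fixed table size and function names are illustrative only.

```c
/* Sketch of the dataset-id -> file-id cache mentioned in post #1.
 * hid_t is a stand-in; real code would use the type from <hdf5.h>. */
typedef long long hid_t;

#define CACHE_SLOTS 64

static struct { hid_t dset; hid_t file; } cache[CACHE_SLOTS];
static int cache_used = 0;

/* called from the H5Dopen/H5Dcreate wrappers, outside any collective op */
void cache_put(hid_t dset, hid_t file)
{
    for (int i = 0; i < cache_used; i++)
        if (cache[i].dset == dset) { cache[i].file = file; return; }
    if (cache_used < CACHE_SLOTS) {
        cache[cache_used].dset = dset;
        cache[cache_used].file = file;
        cache_used++;
    }
}

/* called from the H5Dread/H5Dwrite wrappers; -1 signals a miss */
hid_t cache_get(hid_t dset)
{
    for (int i = 0; i < cache_used; i++)
        if (cache[i].dset == dset)
            return cache[i].file;
    return -1;
}

/* called from the H5Dclose wrapper to keep entries from going stale */
void cache_drop(hid_t dset)
{
    for (int i = 0; i < cache_used; i++)
        if (cache[i].dset == dset) {
            cache[i] = cache[cache_used - 1];
            cache_used--;
            return;
        }
}
```

The open question above still applies: whether these HDF5 versions can recycle dataset or file IDs in a way that would silently invalidate such a cache.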


#4

Do you happen to have a minimal compilable example by any chance? I am running throughput tests on various HDF5 versions with OpenMPI 4.0.1 + the latest PMIx release + an Ubuntu 18.04 custom kernel + OrangeFS 2.9 + SLURM on an AWS EC2-based cluster.

If you provide the full example, I can run it for you and post the result. As the developer of H5CPP (parallel/serial edition), I have an interest in confirming any unusual behaviour.

best wishes:
steven


#5

I have not pared this down to a true MCE yet and am traveling this week, so I will have limited time to work on that. If the patched ph5example.c is helpful, it’s here: ph5example.c (34.1 KB). It can obviously be stripped down somewhat:

  • It needs to be run with a parallel/collective write and read in the same run, so the other code paths can probably be treated as dead/irrelevant.
  • Not much is being written to the example file, in either size or complexity, but it may be possible to pare that down to a single int or even an empty file.
  • Everything after the initial open along the read path is unreachable due to the failed consistency check, so that can be treated as irrelevant.


#6

Hi William,

I ran the ph5example.c you provided, and it threw an error for me as well; see the attached output and please confirm it matches yours. As things stand I can’t confirm anything, because I am also having other issues that are not HDF5 related. Also, the test is not minimal. On the bright side, H5CPP relies on the H5I_xxx calls mostly for reference counting, and those seem to work well for me.

I suggest we get back to this once your trip is over; until then I can fix the unrelated issues, and then we can confirm whether this is a bug or something else. For that I have a working set-up to compile software against nearly all versions of HDF5 going back to 2003, so if the problem is confirmed with one version, we can explore whether it exists with others.
test.txt (8.2 KB)

ps: Did we happen to meet at ISC’19 in person and have a brief conversation?

cheers: steven


#7

@steven Almost certainly we did talk at ISC; I am usually better with names and would have tagged you directly had I gotten your card.

I’ve stripped the ph5 example down quite a bit: nothing spurious after the read, no unnecessary code for the non-collective options, and a single 2x2 write in collective mode using OpenMPI. The output is unchanged and yours matches mine as well.

I don’t think we can reasonably get any smaller: if there’s no dataset to write collectively, there’s no collective-mode write, right?

minph5example.c (15.8 KB)