H5Dopen collective, driver MPIO

Hi all,

I’m trying to write to an HDF5 dataset that is opened by only one process. This process does not modify the dataset’s metadata and only wants to write the data via write_direct:

ds = group[ds_name]        # this lookup is where the dataset open happens
ds.write_direct(img_data)  # raw write into the existing dataset, no metadata changes

However, the first line (opening the dataset) seems to be collective - the actual collective call happens on line 288 of h5py/_hl/group.py:

oid = h5o.open(self.id, self._e(name), lapl=self._lapl)

According to the documentation (https://support.hdfgroup.org/HDF5/doc/RM/CollectiveCalls.html), the call does not need to be collective if the dataset is not modified, which is confirmed in the thread Collective H5Dopen. Anything else would make little sense, since then effectively all dataset operations would have to be collective.

Is this a bug, or am I using the h5py API incorrectly?

Thanks very much!

Cheers,

Jiri

Hi all, since there has been no reaction here, I tried to dig into it myself and verified with the C library that the call is indeed not collective there:

/* comm and info are the usual MPI handles (e.g. MPI_COMM_WORLD, MPI_INFO_NULL) */
plist_id = H5Pcreate(H5P_FILE_ACCESS);
H5Pset_fapl_mpio(plist_id, comm, info);

/* second argument of H5Fopen is the access flags, not a property list */
file_id = H5Fopen(H5FILE_NAME, H5F_ACC_RDONLY, plist_id);

/* only rank 0 opens the dataset */
if (mpi_rank == 0) {
    dset_id = H5Oopen(file_id, DATASETNAME, H5P_DEFAULT);
}

indeed finishes normally when run with multiple processes. It doesn’t matter that only rank 0 calls the H5Oopen function.
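(For reference, such a test is typically compiled with the parallel HDF5 compiler wrapper and launched via MPI, e.g. h5pcc open_test.c -o open_test and then mpirun -np 4 ./open_test; the source file name here is just a placeholder.)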

Since the method:

h5o.open(self.id, self._e(name), lapl=self._lapl)

should be only a Cython wrapper around the C function I tested above, the only logical explanation I can think of is that it does not use the H5P_DEFAULT property list, and that self._lapl has some properties that force the collective behavior of h5o.open.
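To rule that out, one could pass a freshly created, plain link access property list instead of h5py’s default - a minimal sketch using the low-level API (f stands for a file already opened with driver='mpio', and the dataset path is a placeholder):

from h5py import h5p, h5o

plain_lapl = h5p.create(h5p.LINK_ACCESS)  # nothing changed from the defaults
oid = h5o.open(f.id, b'/group/dataset', lapl=plain_lapl)

If this open still blocks unless all ranks call it, the property list is not the culprit.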

I also found in base.py what the default link access property list looks like:

def default_lapl():
    """ Default link access property list """
    lapl = h5p.create(h5p.LINK_ACCESS)
    fapl = h5p.create(h5p.FILE_ACCESS)
    fapl.set_fclose_degree(h5f.CLOSE_STRONG)
    lapl.set_elink_fapl(fapl)
    return lapl

but I cannot find any suspicious property there. Can I somehow print the whole property list, to compare the property lists between the C library and h5py?
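The best I have found so far is to read individual properties back through the low-level getters that mirror the setters used in default_lapl - a sketch:

from h5py import h5f
from h5py._hl.base import default_lapl

lapl = default_lapl()          # the lapl h5py passes to h5o.open
fapl = lapl.get_elink_fapl()   # the external-link fapl attached to it
print(fapl.get_fclose_degree() == h5f.CLOSE_STRONG)  # True, as set in default_lapl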

Another thing that came to my mind is that I am using track_order=True for my file structure; maybe that could also force the operations to be collective?

Edit: On second thought, track_order is irrelevant, as the C program above works with the same HDF5 file and opening the dataset there works independently. That brings me back to the only possible cause - the property lists.

Anyway, I am already at the limits of what I am able to debug; please help.

Cheers,

Jiri

Is your dataset laid out contiguously?

The set_fclose_degree and set_elink_fapl calls in that default are not there in C. I can’t remember what the default is for the fclose_degree, but it’s certainly not H5F_CLOSE_STRONG. I believe it’s H5F_CLOSE_WEAK. The link access property list is also just plain vanilla in C.

Have you tried your example with just the default POSIX driver, i.e., rank 0 opening the file and doing write_direct?
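For example (a sketch; file and dataset names and the image shape are placeholders):

from mpi4py import MPI
import numpy as np
import h5py

img_data = np.zeros((512, 512), dtype='f4')    # stand-in for the preprocessed image
if MPI.COMM_WORLD.rank == 0:
    with h5py.File('data.h5', 'r+') as f:      # default 'sec2' (POSIX) driver
        f['/group/dataset'].write_direct(img_data)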

G.

Hi, thanks very much for the reply!

My dataset is chunked, but it is arranged so that only one writer will ever be writing to it.

Writing the file with the default driver works completely fine - in fact, I have an existing solution that I am now parallelizing. For the sake of this problem, we can simplify it so that each writer takes one image, does some preprocessing on it, and writes it into a dataset in the HDF5 file.

All of the metadata preparation and dataset creation is done by rank 0 beforehand, so each writer only finds its corresponding dataset and writes the data. After the metadata operations, the file is closed and reopened by all writers in driver='mpio' mode.
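Schematically, the two phases look like this (a sketch with made-up names and shapes; mine differ but follow the same pattern):

from mpi4py import MPI
import numpy as np
import h5py

comm = MPI.COMM_WORLD

# phase 1: rank 0 alone creates all groups and datasets with the default driver
if comm.rank == 0:
    with h5py.File('data.h5', 'w') as f:
        for i in range(comm.size):
            f.create_dataset(f'images/img{i}', shape=(512, 512), dtype='f4', chunks=True)
comm.Barrier()

# phase 2: all ranks collectively reopen the file with the MPIO driver,
# then each rank opens and writes only "its" dataset
with h5py.File('data.h5', 'r+', driver='mpio', comm=comm) as f:
    ds = f[f'images/img{comm.rank}']   # this per-rank open is what blocks
    ds.write_direct(np.zeros((512, 512), dtype='f4'))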

The important thing is that opening a group works independently, while opening a dataset does not.

I will try adjusting the fclose_degree property to see if that changes anything.
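Concretely, I would rebuild the default_lapl recipe with a weak close degree and pass that to h5o.open - a sketch (f is the mpio-opened file, and the dataset path is a placeholder):

from h5py import h5p, h5f, h5o

lapl = h5p.create(h5p.LINK_ACCESS)
fapl = h5p.create(h5p.FILE_ACCESS)
fapl.set_fclose_degree(h5f.CLOSE_WEAK)  # instead of h5py's CLOSE_STRONG
lapl.set_elink_fapl(fapl)
oid = h5o.open(f.id, b'/group/dataset', lapl=lapl)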

Cheers,

Jiri

The chunked layout could be part of the problem. The metadata is very different in that case, and for the library to determine that it doesn’t change is non-trivial. Could you try your experiment with a contiguous layout?
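In h5py terms the difference is only in how the dataset is created (hypothetical names; f is an open file):

# chunked: the dataset has a chunk index in addition to its header,
# so there is more metadata for the library to reason about
f.create_dataset('img_chunked', shape=(512, 512), dtype='f4', chunks=(64, 64))

# contiguous: offset and size are fixed at creation time;
# simply omit chunks (and compression) to get this layout
f.create_dataset('img_contig', shape=(512, 512), dtype='f4')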

Yep, good thinking - when I switch to a contiguous dataset, the open is indeed not collective and I am able to write_direct independently as well. By the way, I tried playing with the fclose_degree and that doesn’t help.

So this boils down to the question: what does h5py do differently from the C library underneath? In the end, it seems that h5py does not allow independent writes to chunked datasets.

Thank you,

Jiri