Good morning everyone,
We’re attempting to create a largely read-only file containing many nested groups and datasets. Profiling the read path shows that 40 to 60 percent of the runtime is spent in H5Gopen. Therefore, we would like to investigate whether a split file, i.e. H5Pset_fapl_split, leads to better performance.
To create the file in a timely manner, we want to first allocate all datasets sequentially on a single rank and then fill the datasets from all ranks using MPI-IO.
When attempting to do so, we found that the application deadlocks when opening the file a second time. The relevant parts of the reproducer are:
    if(comm_rank == 0) {
      auto fcpl = H5P_DEFAULT;
      auto fapl = H5Pcreate(H5P_FILE_ACCESS);
      H5Pset_libver_bounds(fapl, H5F_LIBVER_V110, H5F_LIBVER_V110);
      H5Pset_fapl_split(
        fapl,
        ".meta", H5P_DEFAULT,
        ".raw", H5P_DEFAULT
      );
      hid_t fid = H5Fcreate(filename.c_str(), H5F_ACC_EXCL, fcpl, fapl);
      // Allocate datasets here.
      H5Pclose(fapl);
      H5Fclose(fid);
    }
    MPI_Barrier(comm);
    {
      auto fapl = H5Pcreate(H5P_FILE_ACCESS);
      H5Pset_libver_bounds(fapl, H5F_LIBVER_V110, H5F_LIBVER_V110);
      auto mpio_fapl = H5Pcreate(H5P_FILE_ACCESS);
      H5Pset_fapl_mpio(mpio_fapl, comm, MPI_INFO_NULL);
      H5Pset_fapl_split(fapl, ".meta", mpio_fapl, ".raw", mpio_fapl);
      // This line deadlocks.
      hid_t fid = H5Fopen(filename.c_str(), H5F_ACC_RDWR, fapl);
      // Fill the datasets.
      H5Pclose(mpio_fapl);
      H5Pclose(fapl);
      H5Fclose(fid);
    }
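For context on the "Fill the datasets" step: the idea is that each rank writes one contiguous slab of a dataset. The following is only a minimal sketch of how such a slab could be computed for a 1-D dataset; `Slab` and `slab_for_rank` are hypothetical names used for illustration and are not part of HDF5 or of our reproducer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of HDF5): the contiguous slab of a
// 1-D dataset that a single MPI rank would write.
struct Slab {
    std::uint64_t offset;  // index of the first element owned by this rank
    std::uint64_t count;   // number of elements owned by this rank
};

// Split n_elems as evenly as possible over n_ranks. The first
// (n_elems % n_ranks) ranks each receive one extra element.
inline Slab slab_for_rank(std::uint64_t n_elems, int rank, int n_ranks) {
    assert(n_ranks > 0 && rank >= 0 && rank < n_ranks);
    const std::uint64_t r = static_cast<std::uint64_t>(rank);
    const std::uint64_t p = static_cast<std::uint64_t>(n_ranks);
    const std::uint64_t base = n_elems / p;
    const std::uint64_t extra = n_elems % p;
    const std::uint64_t count = base + (r < extra ? 1 : 0);
    // Ranks before this one hold `base` elements each, plus one extra
    // element for every earlier rank that is among the first `extra` ranks.
    const std::uint64_t offset = r * base + (r < extra ? r : extra);
    return Slab{offset, count};
}
```

In an actual write, `offset` and `count` would presumably be passed as the start and count arguments of H5Sselect_hyperslab before a write through the MPI-IO driver; that part is omitted here because it is not where the deadlock occurs.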
The link to a full reproducer including backtraces can be found at the end of this post.
The questions are:
- Does the split file driver work with MPI-IO?
- Is there anything obviously wrong in the way we’re trying to use MPI-IO with split files?
Many thanks for your attention.