I'm having some issues with file locking on a parallel filesystem, which I
believe is related to the problem described here:
https://lists.hdfgroup.org/pipermail/hdf-forum_lists.hdfgroup.org/2011-February/004254.html
I tried the suggestion of disabling romio_ds_read and romio_ds_write, but
this doesn't fix the problem. I've also tried setting H5Pset_sieve_buf_size
to 0 to disable data sieving in HDF5 itself, but that doesn't work, either.
Here's a short snippet of the relevant code:
hid_t fapl_id = H5Pcreate(H5P_FILE_ACCESS);

/* ROMIO hints to disable data sieving on reads and writes */
MPI_Info info;
MPI_Info_create(&info);
MPI_Info_set(info, "romio_ds_read", "disable");
MPI_Info_set(info, "romio_ds_write", "disable");

/* No HDF5 sieve buffer, MPI-IO driver with the hints above */
H5Pset_sieve_buf_size(fapl_id, 0);
H5Pset_fapl_mpio(fapl_id, comm, info);

hid_t f_id = H5Fcreate("test_file.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl_id);
It's on this last line that the program hangs until MPI_ABORT is eventually
called. I get the same problem regardless of whether I create a new file or
open an existing one:
hid_t f_id = H5Fopen("test_file.h5", H5F_ACC_RDWR, fapl_id);
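
In case it helps, here is a stripped-down, self-contained sketch of the same
sequence (comm is just MPI_COMM_WORLD here and error checking is omitted; my
real code is more involved):

#include <stdio.h>
#include <mpi.h>
#include <hdf5.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    MPI_Comm comm = MPI_COMM_WORLD;

    /* ROMIO hints to disable data sieving on reads and writes */
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "romio_ds_read", "disable");
    MPI_Info_set(info, "romio_ds_write", "disable");

    /* File access property list: no HDF5 sieve buffer, MPI-IO driver */
    hid_t fapl_id = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_sieve_buf_size(fapl_id, 0);
    H5Pset_fapl_mpio(fapl_id, comm, info);

    /* This is where the hang occurs on the NFS mount */
    hid_t f_id = H5Fcreate("test_file.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl_id);

    if (f_id >= 0) H5Fclose(f_id);
    H5Pclose(fapl_id);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}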
Other relevant information:
1. Filesystem is NFS v3 with noac off (can't change)
2. Tested with HDF5 1.8.13 and 1.8.14
3. Tested with OpenMPI 1.6.5 and 1.10.2
What am I missing? I saw that there's an option to disable filesystem
atomicity, but this requires that the file already be opened, and I can't
even get that far.
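
For reference, the atomicity option I'm referring to is, I believe,
H5Fset_mpi_atomicity, which as far as I understand would be used roughly like
this once a file handle exists (which is exactly the step I can't get past):

/* Hypothetical sketch: assumes f_id is a file already opened with the MPI-IO driver */
herr_t status = H5Fset_mpi_atomicity(f_id, 0);  /* 0 (false) = non-atomic MPI file access */
if (status < 0)
    fprintf(stderr, "H5Fset_mpi_atomicity failed\n");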
I also know that this same code works on other machines with different
filesystems and/or mount options.