ROS3 in combination with MPI


Lately, I have been testing the ROS3 functionality of the HDF5 library. With the serial HDF5 library, everything works as expected. I then tried the following combination: parallel HDF5 with ROS3 support. If I first set ROS3 using H5Pset_fapl_ros3() and then set MPIIO using H5Pset_fapl_mpio(), I get several HDF5 errors. If I instead set MPIIO first and then ROS3, I do not get any errors, but the retrieved data does not make any sense. I also ran a test without calling H5Pset_fapl_mpio() at all. Again there were no errors, but the retrieved data still does not look right to me.

So my question is: should this combination of ROS3 + MPI work? If so, is there an example of how to use it?

Best regards,

@jan-willem.blokland both VFDs (ROS3 and MPIIO) are terminal, i.e., they talk to storage, and cannot be “stacked.” Also, the parallel HDF5 library is not a superset of the sequential HDF5 library. What should work just fine is to use a sequential version of the library w/ ROS3 in an MPI job and do only H5Pset_fapl_ros3 in each rank. Using a parallel build w/ ROS3 enabled might work, but I don’t know for sure.

Best, G.

@gheber Thanks for your explanation. With this knowledge, my observations make sense now. From my small test, it seems that a parallel build w/ ROS3 does not work properly either: the retrieved data is not what I expected. Thanks also for your suggestion of using the sequential version w/ ROS3 in an MPI job. Depending on what our workflows look like, it is clearly a useful option to consider. I am slowly starting to realize that it may be better to make use of HSDS.

It might be an idea to adjust the autotools and CMake build systems so that it is not possible to build the parallel library with the ROS3 option enabled, or at least not without overriding the defaults, as can be done for building the parallel library with the C++ API.

My colleague @mlarson has done fantastic work w/ the REST-VOL (which is read/write!), and we’ll hear more at HUG’23 about that.