Fairly new to HDF5. I'm currently implementing a parallel reader and ran into a question that I haven't seen directly answered anywhere in the documentation or examples yet.
Some of the datasets in the file are written as contiguous blocks, one per rank (distributed mesh data). Each rank knows which subset of the dataset belongs to it, so this is straightforward to implement using the hyperslab interface.
However, other datasets need to be read in their entirety by every rank (shared data stored redundantly across all ranks). What is the best practice for getting good performance in this scenario? Is this something parallel HDF5 has a built-in solution for, or is it an exercise left to the reader?
The library doesn't seem to like it when I attempt to read with `H5S_ALL` dataspace selections and an `H5FD_MPIO_COLLECTIVE` transfer property. I get "H5D__read(): collective access for MPI-based drivers only" errors.