Hello HDF5 Community,
I’m encountering an issue with reading large HDF5 files (500GB+) on a multi-node HPC system. Specifically, when using parallel I/O with MPI, I’ve noticed:
- Slow Read Performance: The read times are significantly slower than expected, even with optimized chunking and collective I/O enabled.
- Occasional Errors: Intermittent H5Dread() errors occur when accessing datasets with millions of rows.
Here’s my setup:
- HDF5 Version: 1.12.2
- MPI Library: OpenMPI 4.1.5
- File System: Lustre
I’ve tried changing the transfer mode via H5Pset_dxpl_mpio() and adjusting chunk sizes, but the improvements are minimal; a simplified sketch of my read path is below. Could this be an issue with Lustre striping or MPI-IO settings?
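For context, here is a stripped-down version of the read path I’m using. The file name, dataset path, and the even row split across ranks are placeholders; the real code handles remainder rows and checks every return value.

```c
/* Stripped-down parallel read: each rank collectively reads a
 * contiguous block of rows. Placeholders: "big.h5", "/data". */
#include <hdf5.h>
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Open the file through the MPI-IO driver. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fopen("big.h5", H5F_ACC_RDONLY, fapl);

    hid_t dset   = H5Dopen2(file, "/data", H5P_DEFAULT);
    hid_t fspace = H5Dget_space(dset);
    hsize_t dims[2];
    H5Sget_simple_extent_dims(fspace, dims, NULL);

    /* Even row split; the real code handles the remainder. */
    hsize_t rows     = dims[0] / (hsize_t)nprocs;
    hsize_t start[2] = { (hsize_t)rank * rows, 0 };
    hsize_t count[2] = { rows, dims[1] };
    H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, count, NULL);
    hid_t mspace = H5Screate_simple(2, count, NULL);

    /* Collective transfer mode -- the property I've been tuning. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

    double *buf = malloc(count[0] * count[1] * sizeof(double));
    if (H5Dread(dset, H5T_NATIVE_DOUBLE, mspace, fspace, dxpl, buf) < 0)
        fprintf(stderr, "rank %d: H5Dread failed\n", rank);

    free(buf);
    H5Pclose(dxpl);
    H5Sclose(mspace);
    H5Sclose(fspace);
    H5Dclose(dset);
    H5Fclose(file);
    H5Pclose(fapl);
    MPI_Finalize();
    return 0;
}
```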
Any advice or insights would be greatly appreciated!
Thank you,