I am experiencing an issue when writing chunked datasets in HDF5 using MPICH2. When the dataset size is small, everything works correctly. However, when the dataset size exceeds approximately 40GB, I encounter the following error:
Assertion failed in file adio/common/ad_write_coll.c at line 877:
(curr_to_proc[p] + len - done_to_proc[p]) == (unsigned) (curr_to_proc[p] + len - done_to_proc[p])
Abort(1) on node 43: Internal error
HDF5: infinite loop closing library
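As far as I can tell from the assertion text alone, it checks that a per-process byte count survives a cast to C `unsigned` unchanged. The sketch below emulates that check in Python. It assumes a 32-bit `unsigned`; the function names are hypothetical and this is not the actual ROMIO code, only an illustration of the shape of the expression.

```python
# Assumption: C `unsigned` is 32 bits wide, as on common 64-bit Linux platforms.
UINT_MAX = 2**32 - 1


def as_c_unsigned(value: int) -> int:
    """Emulate a C cast to 32-bit unsigned, which wraps modulo 2**32."""
    return value & UINT_MAX


def passes_cast_check(curr: int, length: int, done: int) -> bool:
    """Mirror the shape of the failing assertion (hypothetical helper).

    The value passes only if the cast to unsigned leaves it unchanged,
    i.e. if it lies in the range 0..UINT_MAX.
    """
    value = curr + length - done
    return value == as_c_unsigned(value)


if __name__ == "__main__":
    GiB = 2**30
    print(passes_cast_check(0, 1 * GiB, 0))  # fits in 32 bits
    print(passes_cast_check(0, 5 * GiB, 0))  # exceeds 32 bits
```

If this reading is right, any single amount of 4 GiB or more destined for one process would trip the check, which might be why only the large datasets fail.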
I am using MPICH2 v4.2.1 and HDF5 v1.14.6. The issue does not occur with OpenMPI or Intel MPI; with either of those, the write completes successfully. Has this issue been encountered before, or is there a known solution?