Writing Large Chunked Data in Parallel using MPICH and HDF5

I am running into an issue when writing chunked datasets in parallel with HDF5 on top of MPICH. When the dataset is small, everything works correctly. However, once the dataset size exceeds roughly 40 GB, I get the following error:
Assertion failed in file adio/common/ad_write_coll.c at line 877:
(curr_to_proc[p] + len - done_to_proc[p]) == (unsigned) (curr_to_proc[p] + len - done_to_proc[p])
Abort(1) on node 43: Internal error

HDF5: infinite loop closing library
L,S_top,T_top,T_top,T_top, ... (the same module list repeats many times)
I am using MPICH 4.2.1 and HDF5 1.14.6. The same code completes successfully with OpenMPI and with Intel MPI; the failure occurs only with MPICH. Has anyone encountered this issue before, or is there a known solution?
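
For reference, the write path looks roughly like the sketch below. This is not my exact code: the file name, dataset name, dimensions, chunk shape, and data type are placeholders, and the sizes would have to be scaled up (more rows per rank and/or more ranks) to reach the 40+ GB regime where the failure appears.

/* Minimal sketch of the collective chunked write described above.
 * Names and sizes are placeholders; build with h5pcc and run under mpiexec. */
#include <stdlib.h>
#include <mpi.h>
#include <hdf5.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* One row block of a 2-D double dataset per rank (sizes are assumptions). */
    hsize_t rows = 64, cols = 1024 * 1024;
    hsize_t dims[2]  = { rows * (hsize_t)nprocs, cols };
    hsize_t chunk[2] = { rows, cols };

    /* Open the file with the MPI-IO virtual file driver. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate("big.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* Chunked dataset layout. */
    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 2, chunk);
    hid_t fspace = H5Screate_simple(2, dims, NULL);
    hid_t dset = H5Dcreate2(file, "data", H5T_NATIVE_DOUBLE, fspace,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT);

    /* Each rank selects and fills its own hyperslab. */
    hsize_t start[2] = { rows * (hsize_t)rank, 0 };
    hsize_t count[2] = { rows, cols };
    H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, count, NULL);
    hid_t mspace = H5Screate_simple(2, count, NULL);
    double *buf = malloc(rows * cols * sizeof(double));
    for (hsize_t i = 0; i < rows * cols; i++) buf[i] = (double)rank;

    /* Collective write: on MPICH this goes through ROMIO's ad_write_coll.c,
     * which is where the assertion above fires. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, mspace, fspace, dxpl, buf);

    free(buf);
    H5Pclose(dxpl); H5Sclose(mspace); H5Sclose(fspace); H5Dclose(dset);
    H5Pclose(dcpl); H5Pclose(fapl); H5Fclose(file);
    MPI_Finalize();
    return 0;
}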

We think this is likely an MPICH (ROMIO) issue rather than an HDF5 one: the failing assertion compares a byte count against its cast to unsigned, which suggests an integer overflow in ROMIO's collective write path for very large per-process transfers. Could you please try a newer MPICH release and let us know whether the problem persists?
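
If upgrading is not immediately possible, one way to check whether ROMIO's collective buffering is the trigger (an assumption on our side, not a confirmed fix) is to route the write around it, either with the romio_cb_write hint or by switching the transfer to independent mode:

/* Diagnostic sketch, not a confirmed fix: property lists that steer I/O away
 * from ROMIO's collective-buffering path while the MPICH issue is investigated. */
#include <mpi.h>
#include <hdf5.h>

/* File-access property list that asks ROMIO not to use collective buffering. */
hid_t fapl_without_cb(MPI_Comm comm)
{
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "romio_cb_write", "disable");  /* ROMIO hint */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, comm, info);
    MPI_Info_free(&info);
    return fapl;
}

/* Transfer property list that uses independent rather than collective writes. */
hid_t dxpl_independent(void)
{
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_INDEPENDENT);
    return dxpl;
}

Independent writes to a chunked dataset are usually slower than collective ones, so this is meant only to narrow down where the failure occurs.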