Crash when writing parallel compressed chunks

Just an observation from a run compiled with 1.10.6 plus the patch provided by @jhenderson earlier in this thread.

When the second H5Dcreate call is moved before the first H5Dwrite in @jrichardshaw’s test program, the error goes away. It looks like the problem occurs when calls to H5Dcreate and H5Dwrite are interleaved.

After reading through H5C.c and adding a few printf statements, it appears that the value of entry_ptr->coll_access checked at line 2271 is not consistent across the 4 running processes, so only 2 of the 4 processes call MPI_Bcast at line 2297, which causes the error.
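
For illustration, the two call orders look roughly like this stripped-down sketch (illustrative only: the real test program uses per-rank hyperslab selections and much larger chunked datasets, and the names and sizes here are made up):

/* Sketch of the interleaved H5Dcreate/H5Dwrite ordering that triggers
 * the error, versus creating both datasets first. Not the actual test
 * program; sizes, names and property lists are illustrative. */
#include <mpi.h>
#include <hdf5.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate("test.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    hsize_t dims[1] = {1024}, chunk[1] = {256};
    hid_t space = H5Screate_simple(1, dims, NULL);
    hid_t dcpl  = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 1, chunk);                 /* chunked, as in the test */
    hid_t dxpl  = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE); /* collective writes */

    double buf[1024] = {0};

    /* Interleaved order (create, write, create, write): error observed. */
    hid_t d1 = H5Dcreate2(file, "dataset1", H5T_NATIVE_DOUBLE, space,
                          H5P_DEFAULT, dcpl, H5P_DEFAULT);
    H5Dwrite(d1, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, dxpl, buf);
    hid_t d2 = H5Dcreate2(file, "dataset2", H5T_NATIVE_DOUBLE, space,
                          H5P_DEFAULT, dcpl, H5P_DEFAULT);
    H5Dwrite(d2, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, dxpl, buf);
    /* Moving the second H5Dcreate2 before the first H5Dwrite made the
     * error go away in my runs. */

    H5Dclose(d1); H5Dclose(d2);
    H5Pclose(dxpl); H5Pclose(dcpl); H5Sclose(space);
    H5Fclose(file); H5Pclose(fapl);
    MPI_Finalize();
    return 0;
}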

Just to follow up. The most recent set of parameters in the Gist also fails on HDF5 1.12.0 (a separate discussion with some HDF5 staff had suggested it might be fixed, which got my hopes up). Crash output below (although I think it’s largely the same as before):

MPI rank [0/4]
rank=0 creating file
MPI rank [1/4]
rank=1 creating file
MPI rank [2/4]
rank=2 creating file
MPI rank [3/4]
rank=3 creating file
rank=0 creating selection [0:4, 0:4194304]
rank=0 creating dataset1
rank=1 creating selection [4:8, 0:4194304]
rank=1 creating dataset1
rank=2 creating selection [8:12, 0:4194304]
rank=2 creating dataset1
rank=3 creating selection [12:16, 0:4194304]
rank=3 creating dataset1
rank=1 writing dataset1
rank=2 writing dataset1
rank=0 writing dataset1
rank=3 writing dataset1
rank=3 finished writing dataset1
rank=3 waiting at barrier
rank=0 finished writing dataset1
rank=0 waiting at barrier
rank=1 finished writing dataset1
rank=1 waiting at barrier
rank=0 creating dataset2
rank=2 finished writing dataset1
rank=2 waiting at barrier
rank=2 creating dataset2
rank=3 creating dataset2
rank=1 creating dataset2
HDF5-DIAG: Error detected in HDF5 (1.12.0) MPI-process 3:
  #000: H5D.c line 151 in H5Dcreate2(): unable to create dataset
    major: Dataset
    minor: Unable to initialize object
  #001: H5VLcallback.c line 1869 in H5VL_dataset_create(): dataset create failed
    major: Virtual Object Layer
    minor: Unable to create file
  #002: H5VLcallback.c line 1835 in H5VL__dataset_create(): dataset create failed
    major: Virtual Object Layer
    minor: Unable to create file
  #003: H5VLnative_dataset.c line 75 in H5VL__native_dataset_create(): unable to create dataset
    major: Dataset
    minor: Unable to initialize object
  #004: H5Dint.c line 411 in H5D__create_named(): unable to create and link to dataset
    major: Dataset
    minor: Unable to initialize object
  #005: H5L.c line 1804 in H5L_link_object(): unable to create new link to object
    major: Links
    minor: Unable to initialize object
  #006: H5L.c line 2045 in H5L__create_real(): can't insert link
    major: Links
    minor: Unable to insert object
  #007: H5Gtraverse.c line 855 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #008: H5Gtraverse.c line 585 in H5G__traverse_real(): can't look up component
    major: Symbol table
    minor: Object not found
  #009: H5Gobj.c line 1125 in H5G__obj_lookup(): can't check for link info message
    major: Symbol table
    minor: Can't get value
  #010: H5Gobj.c line 326 in H5G__obj_get_linfo(): unable to read object header
    major: Symbol table
    minor: Can't get value
  #011: H5Omessage.c line 883 in H5O_msg_exists(): unable to protect object header
    major: Object header
    minor: Unable to protect metadata
  #012: H5Oint.c line 1082 in H5O_protect(): unable to load object header
    major: Object header
    minor: Unable to protect metadata
  #013: H5AC.c line 1312 in H5AC_protect(): H5C_protect() failed
    major: Object cache
    minor: Unable to protect metadata
  #014: H5C.c line 2299 in H5C_protect(): MPI_Bcast failed
    major: Internal error (too specific to document in detail)
    minor: Some MPI function failed
  #015: H5C.c line 2299 in H5C_protect(): MPI_ERR_TRUNCATE: message truncated
    major: Internal error (too specific to document in detail)
    minor: MPI Error String
(MPI-process 2 printed an identical error stack, interleaved with the one above.)
rank=2 writing dataset2
rank=3 writing dataset2
HDF5-DIAG: Error detected in HDF5 (1.12.0) MPI-process 2:
  #000: H5Dio.c line 300 in H5Dwrite(): dset_id is not a dataset ID
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.12.0) MPI-process 2:
  #000: H5D.c line 332 in H5Dclose(): not a dataset ID
    major: Invalid arguments to routine
    minor: Inappropriate type
rank=2 closing everything
HDF5-DIAG: Error detected in HDF5 (1.12.0) MPI-process 3:
  #000: H5Dio.c line 300 in H5Dwrite(): dset_id is not a dataset ID
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.12.0) MPI-process 3:
  #000: H5D.c line 332 in H5Dclose(): not a dataset ID
    major: Invalid arguments to routine
    minor: Inappropriate type
rank=3 closing everything

Hi @jrichardshaw,

unfortunately there hasn’t been much time to look at this. However, we do know of some other folks who are looking for a fix to this issue as well. Based on @wkliao’s observation, I’m fairly certain it’s just a matter of inserting barriers in the appropriate places in the library’s code. I had trouble reproducing this with your example, so I wasn’t quite able to determine whether this really was the source of the issue, but I’m thinking that running several rounds of

H5Dcreate(...);
H5Dwrite(...);

should eventually produce the issue for me. In any case, I believe there should be more info on this issue relatively soon.

Hi again @jrichardshaw, @wkliao and others in this thread. I’ve narrowed down the cause of this issue and will have a small patch to post after I’ve discussed the fix with other developers. Provided that the patch works here and doesn’t cause further issues, we should be able to get the fix in quickly afterwards.

Wonderful. Thanks @jhenderson! I’ll be happy to test the patch whenever you post it.

Hi @jrichardshaw and @wkliao,

attached is a small patch against the 1.12 branch that temporarily disables the collective metadata reads feature in HDF5, which should make the issue disappear for now. However, this is only a temporary fix and may affect performance. The issue stems from an oversight in the design of the collective metadata reads feature that has effectively been masked until recently, and it will need to be fixed properly. While this feature wasn’t explicitly enabled in your test program, there are some cases where the library implicitly turns it on because metadata modifications need to be collective, such as for H5Dcreate. That behavior, combined with your chosen chunk size and number of chunks, was right on the line needed for the issue to appear. The timeline for fixing this correctly isn’t clear yet, but we hope to have a fix in time for the next release of HDF5.

disable_coll_md_reads.patch (480 Bytes)
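
For anyone who wants to experiment, the user-facing switches for collective metadata operations live on the file access property list; the test program in this thread never calls them, since the library enables collective metadata reads implicitly in some code paths (which is what the attached patch temporarily disables). A minimal sketch, with the file name and setup illustrative only:

#include <mpi.h>
#include <hdf5.h>

/* Illustrative only: shows where the application-level collective
 * metadata switches live. The example program in this thread does not
 * set these; the library turns collective metadata reads on implicitly
 * in some code paths. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    hid_t fapl_id = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl_id, MPI_COMM_WORLD, MPI_INFO_NULL);

    H5Pset_all_coll_metadata_ops(fapl_id, 1); /* collective metadata reads  */
    H5Pset_coll_metadata_write(fapl_id, 1);   /* collective metadata writes */

    hid_t file_id = H5Fcreate("coll_md.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl_id);

    H5Fclose(file_id);
    H5Pclose(fapl_id);
    MPI_Finalize();
    return 0;
}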


Thanks for the patch @jhenderson. We’ve been testing it but are still seeing failures. One of my colleagues has posted a fuller description (the post is awaiting approval), but what we’re finding is that the patch works with the nominal test case above, while if we go back to the first set of parameters (CHUNK1=32768; NCHUNK1=32) it hangs. This seems more similar to the first issue found in this thread.

Anyway, I think my colleague’s pending post has more details (including stack traces), so I won’t try to repeat them here.


Thanks for the latest patch @jhenderson.
I applied it to both the HEAD of the hdf5_1_12 branch and the tag hdf5-1_12_0.
Unfortunately, the minimal test supplied by @jrichardshaw still hangs when built against either of these if I uncomment

// Equivalent to original gist
// Works on 1.10.5 with patch, crashes on 1.10.5 vanilla and hangs on 1.10.6
#define CHUNK1 32768
#define NCHUNK1 32

These are the stack traces I got using tmpi 4 gdb ./testh5:

Two of the four ranks were blocked inside the collective write itself (H5FDmpio.c:1466); they showed the same backtrace, differing only in pointer values and file offsets:

#0  0x00002aaaab49e9a7 in PMPI_Type_size_x ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#1  0x00002aaaab52d0f3 in ADIOI_GEN_WriteContig ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#2  0x00002aaaab531323 in ADIOI_GEN_WriteStrided ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#3  0x00002aaaab52faab in ADIOI_GEN_WriteStridedColl ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#4  0x00002aaaab544fac in MPIOI_File_write_all ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#5  0x00002aaaab545531 in mca_io_romio_dist_MPI_File_write_at_all ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#6  0x00002aaaab514922 in mca_io_romio321_file_write_at_all ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#7  0x00002aaaab4848a8 in PMPI_File_write_at_all ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#8  0x000000000073d5a5 in H5FD__mpio_write (_file=0xceec90, type=H5FD_MEM_DRAW, dxpl_id=<optimized out>, addr=3688, size=<optimized out>,
    buf=0x2aaaba5fb010) at H5FDmpio.c:1466
#9  0x00000000004f2413 in H5FD_write (file=file@entry=0xceec90, type=type@entry=H5FD_MEM_DRAW, addr=addr@entry=3688, size=size@entry=1,
    buf=buf@entry=0x2aaaba5fb010) at H5FDint.c:248
#10 0x000000000077eea5 in H5F__accum_write (f_sh=f_sh@entry=0xcf02d0, map_type=map_type@entry=H5FD_MEM_DRAW, addr=addr@entry=3688,
    size=size@entry=1, buf=buf@entry=0x2aaaba5fb010) at H5Faccum.c:826
#11 0x00000000005ef5b7 in H5PB_write (f_sh=f_sh@entry=0xcf02d0, type=type@entry=H5FD_MEM_DRAW, addr=addr@entry=3688, size=size@entry=1,
    buf=buf@entry=0x2aaaba5fb010) at H5PB.c:1031
#12 0x00000000004d9079 in H5F_shared_block_write (f_sh=0xcf02d0, type=type@entry=H5FD_MEM_DRAW, addr=3688, size=size@entry=1,
    buf=0x2aaaba5fb010) at H5Fio.c:205
#13 0x000000000073a113 in H5D__mpio_select_write (io_info=0x7fffffff82e0, type_info=<optimized out>, mpi_buf_count=1,
    file_space=<optimized out>, mem_space=<optimized out>) at H5Dmpio.c:490
#14 0x0000000000730e2b in H5D__final_collective_io (io_info=io_info@entry=0x7fffffff82e0, type_info=type_info@entry=0x7fffffff8260,
    mpi_buf_count=mpi_buf_count@entry=1, mpi_file_type=0xd70760, mpi_buf_type=0xd717a0) at H5Dmpio.c:2124
#15 0x0000000000736129 in H5D__link_chunk_collective_io (io_info=io_info@entry=0x7fffffff82e0, type_info=type_info@entry=0x7fffffff8260,
    fm=fm@entry=0xd110c0, sum_chunk=<optimized out>) at H5Dmpio.c:1234
#16 0x0000000000739b11 in H5D__chunk_collective_io (io_info=io_info@entry=0x7fffffff82e0, type_info=type_info@entry=0x7fffffff8260,
    fm=fm@entry=0xd110c0) at H5Dmpio.c:883
#17 0x000000000073a519 in H5D__chunk_collective_write (io_info=0x7fffffff82e0, type_info=0x7fffffff8260, nelmts=<optimized out>,
    file_space=<optimized out>, mem_space=<optimized out>, fm=0xd110c0) at H5Dmpio.c:960
#18 0x00000000004955ac in H5D__write (dataset=dataset@entry=0xcf4db0, mem_type_id=mem_type_id@entry=216172782113783850,
    mem_space=0xce5050, file_space=0xce2f40, buf=<optimized out>, buf@entry=0x2aaaba5fb010) at H5Dio.c:780
#19 0x00000000007038d8 in H5VL__native_dataset_write (obj=0xcf4db0, mem_type_id=216172782113783850, mem_space_id=288230376151711748,
    file_space_id=288230376151711747, dxpl_id=<optimized out>, buf=0x2aaaba5fb010, req=0x0) at H5VLnative_dataset.c:206
#20 0x00000000006e36e2 in H5VL__dataset_write (obj=0xcf4db0, cls=0xac3520, mem_type_id=mem_type_id@entry=216172782113783850,
    mem_space_id=mem_space_id@entry=288230376151711748, file_space_id=file_space_id@entry=288230376151711747,
    dxpl_id=dxpl_id@entry=792633534417207318, buf=0x2aaaba5fb010, req=0x0) at H5VLcallback.c:2151
#21 0x00000000006ecaa5 in H5VL_dataset_write (vol_obj=vol_obj@entry=0xcf4c50, mem_type_id=mem_type_id@entry=216172782113783850,
    mem_space_id=mem_space_id@entry=288230376151711748, file_space_id=file_space_id@entry=288230376151711747,
    dxpl_id=dxpl_id@entry=792633534417207318, buf=buf@entry=0x2aaaba5fb010, req=0x0) at H5VLcallback.c:2185
#22 0x0000000000493d8f in H5Dwrite (dset_id=<optimized out>, mem_type_id=216172782113783850, mem_space_id=288230376151711748,
    file_space_id=288230376151711747, dxpl_id=792633534417207318, buf=0x2aaaba5fb010) at H5Dio.c:313
#23 0x0000000000404096 in main (argc=1, argv=0x7fffffff8728) at test_ph5.c:98

The other two ranks were blocked in MPI_File_set_view (reached via H5FDmpio.c:1481), inside the same H5Dwrite call; again both showed essentially the same stack, differing only in pointer values and file offsets:

#0  0x00002aaab8220b46 in psm2_mq_ipeek2 () from /cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/lib/libpsm2.so.2
#1  0x00002aaab8002409 in ompi_mtl_psm2_progress ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/openmpi/mca_mtl_psm2.so
#2  0x00002aaaabf92e0b in opal_progress ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libopen-pal.so.40
#3  0x00002aaaab45f435 in ompi_request_default_wait ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#4  0x00002aaaab4bf303 in ompi_coll_base_sendrecv_actual ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#5  0x00002aaaab4bf739 in ompi_coll_base_allreduce_intra_recursivedoubling ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#6  0x00002aaaab4735b8 in PMPI_Allreduce ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#7  0x00002aaaab543afc in mca_io_romio_dist_MPI_File_set_view ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#8  0x00002aaaab5139ab in mca_io_romio321_file_set_view ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#9  0x00002aaaab483d68 in PMPI_File_set_view ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#10 0x000000000073d5cd in H5FD__mpio_write (_file=0xceeb90, type=H5FD_MEM_DRAW, dxpl_id=<optimized out>, addr=33558120,
    size=<optimized out>, buf=0x2aaaba5fb010) at H5FDmpio.c:1481
#11 0x00000000004f2413 in H5FD_write (file=file@entry=0xceeb90, type=type@entry=H5FD_MEM_DRAW, addr=addr@entry=33558120,
    size=size@entry=1, buf=buf@entry=0x2aaaba5fb010) at H5FDint.c:248
#12 0x000000000077eea5 in H5F__accum_write (f_sh=f_sh@entry=0xcf0270, map_type=map_type@entry=H5FD_MEM_DRAW, addr=addr@entry=33558120,
    size=size@entry=1, buf=buf@entry=0x2aaaba5fb010) at H5Faccum.c:826
#13 0x00000000005ef5b7 in H5PB_write (f_sh=f_sh@entry=0xcf0270, type=type@entry=H5FD_MEM_DRAW, addr=addr@entry=33558120,
    size=size@entry=1, buf=buf@entry=0x2aaaba5fb010) at H5PB.c:1031
#14 0x00000000004d9079 in H5F_shared_block_write (f_sh=0xcf0270, type=type@entry=H5FD_MEM_DRAW, addr=33558120, size=size@entry=1,
    buf=0x2aaaba5fb010) at H5Fio.c:205
#15 0x000000000073a113 in H5D__mpio_select_write (io_info=0x7fffffff82e0, type_info=<optimized out>, mpi_buf_count=1,
    file_space=<optimized out>, mem_space=<optimized out>) at H5Dmpio.c:490
#16 0x0000000000730e2b in H5D__final_collective_io (io_info=io_info@entry=0x7fffffff82e0, type_info=type_info@entry=0x7fffffff8260,
    mpi_buf_count=mpi_buf_count@entry=1, mpi_file_type=0xd6f800, mpi_buf_type=0xd70840) at H5Dmpio.c:2124
#17 0x0000000000736129 in H5D__link_chunk_collective_io (io_info=io_info@entry=0x7fffffff82e0, type_info=type_info@entry=0x7fffffff8260,
    fm=fm@entry=0xd10790, sum_chunk=<optimized out>) at H5Dmpio.c:1234
#18 0x0000000000739b11 in H5D__chunk_collective_io (io_info=io_info@entry=0x7fffffff82e0, type_info=type_info@entry=0x7fffffff8260,
    fm=fm@entry=0xd10790) at H5Dmpio.c:883
#19 0x000000000073a519 in H5D__chunk_collective_write (io_info=0x7fffffff82e0, type_info=0x7fffffff8260, nelmts=<optimized out>,
    file_space=<optimized out>, mem_space=<optimized out>, fm=0xd10790) at H5Dmpio.c:960
#20 0x00000000004955ac in H5D__write (dataset=dataset@entry=0xcf4630, mem_type_id=mem_type_id@entry=216172782113783850,
    mem_space=0xce4ff0, file_space=0xce2ee0, buf=<optimized out>, buf@entry=0x2aaaba5fb010) at H5Dio.c:780
#21 0x00000000007038d8 in H5VL__native_dataset_write (obj=0xcf4630, mem_type_id=216172782113783850, mem_space_id=288230376151711748,
    file_space_id=288230376151711747, dxpl_id=<optimized out>, buf=0x2aaaba5fb010, req=0x0) at H5VLnative_dataset.c:206
#22 0x00000000006e36e2 in H5VL__dataset_write (obj=0xcf4630, cls=0xac3520, mem_type_id=mem_type_id@entry=216172782113783850,
    mem_space_id=mem_space_id@entry=288230376151711748, file_space_id=file_space_id@entry=288230376151711747,
    dxpl_id=dxpl_id@entry=792633534417207318, buf=0x2aaaba5fb010, req=0x0) at H5VLcallback.c:2151
#23 0x00000000006ecaa5 in H5VL_dataset_write (vol_obj=vol_obj@entry=0xcf44d0, mem_type_id=mem_type_id@entry=216172782113783850,
    mem_space_id=mem_space_id@entry=288230376151711748, file_space_id=file_space_id@entry=288230376151711747,
    dxpl_id=dxpl_id@entry=792633534417207318, buf=buf@entry=0x2aaaba5fb010, req=0x0) at H5VLcallback.c:2185
#24 0x0000000000493d8f in H5Dwrite (dset_id=<optimized out>, mem_type_id=216172782113783850, mem_space_id=288230376151711748,
    file_space_id=288230376151711747, dxpl_id=792633534417207318, buf=0x2aaaba5fb010) at H5Dio.c:313
#25 0x0000000000404096 in main (argc=1, argv=0x7fffffff8728) at test_ph5.c:98

After we found out that we can work around the crash mentioned above by setting the OpenMPI I/O backend to ompio, and that we have to use a release build of HDF5 to get around the crash described in Crash when freeing user-provided buffer on filter callback, we found that the minimal test also crashes if we set

#define _COMPRESS
#define CHUNK1 256
#define NCHUNK1 8192

This is with the 1.12 branch plus the patch from @jhenderson, as both a debug and a release build:

rank=1 writing dataset2
rank=3 writing dataset2
rank=2 writing dataset2
rank=0 writing dataset2
[cdr1042:149555:0:149555] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x1211e95c)
==== backtrace ====
 0 0x0000000000033280 killpg()  ???:0
 1 0x0000000000145c24 __memcpy_avx512_unaligned_erms()  ???:0
 2 0x000000000006ac5c opal_generic_simple_pack()  ???:0
 3 0x00000000000040cf ompi_mtl_psm2_isend()  ???:0
 4 0x00000000001c772b mca_pml_cm_isend()  pml_cm.c:0
 5 0x000000000011a1ef shuffle_init.isra.1()  fcoll_dynamic_gen2_file_write_all.c:0
 6 0x000000000011c11b mca_fcoll_dynamic_gen2_file_write_all()  ???:0
 7 0x00000000000bcd7e mca_common_ompio_file_write_at_all()  ???:0
 8 0x0000000000159b96 mca_io_ompio_file_write_at_all()  ???:0
 9 0x00000000000958a8 PMPI_File_write_at_all()  ???:0
10 0x000000000073d5b6 H5FD__mpio_write()  /scratch/rickn/hdf5/src/H5FDmpio.c:1466
11 0x00000000004f2424 H5FD_write()  /scratch/rickn/hdf5/src/H5FDint.c:248
12 0x000000000077eeb6 H5F__accum_write()  /scratch/rickn/hdf5/src/H5Faccum.c:826
13 0x00000000005ef5c8 H5PB_write()  /scratch/rickn/hdf5/src/H5PB.c:1031
14 0x00000000004d92d0 H5F_block_write()  /scratch/rickn/hdf5/src/H5Fio.c:251
15 0x000000000044d0ea H5C__flush_single_entry()  /scratch/rickn/hdf5/src/H5C.c:6109
16 0x000000000072d01b H5C__flush_candidates_in_ring()  /scratch/rickn/hdf5/src/H5Cmpio.c:1372
17 0x000000000072d989 H5C__flush_candidate_entries()  /scratch/rickn/hdf5/src/H5Cmpio.c:1193
18 0x000000000072f603 H5C_apply_candidate_list()  /scratch/rickn/hdf5/src/H5Cmpio.c:386
19 0x000000000072ace3 H5AC__propagate_and_apply_candidate_list()  /scratch/rickn/hdf5/src/H5ACmpio.c:1276
20 0x000000000072af40 H5AC__rsp__dist_md_write__flush_to_min_clean()  /scratch/rickn/hdf5/src/H5ACmpio.c:1835
21 0x000000000072cc0c H5AC__run_sync_point()  /scratch/rickn/hdf5/src/H5ACmpio.c:2157
22 0x0000000000422a89 H5AC_unprotect()  /scratch/rickn/hdf5/src/H5AC.c:1568
23 0x000000000075006b H5B__insert_helper()  /scratch/rickn/hdf5/src/H5B.c:1101
24 0x00000000007507fc H5B__insert_helper()  /scratch/rickn/hdf5/src/H5B.c:998
25 0x0000000000750e1f H5B_insert()  /scratch/rickn/hdf5/src/H5B.c:596
26 0x0000000000753dde H5D__btree_idx_insert()  /scratch/rickn/hdf5/src/H5Dbtree.c:1009
27 0x0000000000735772 H5D__link_chunk_filtered_collective_io()  /scratch/rickn/hdf5/src/H5Dmpio.c:1462
28 0x0000000000739abe H5D__chunk_collective_io()  /scratch/rickn/hdf5/src/H5Dmpio.c:878
29 0x000000000073a52a H5D__chunk_collective_write()  /scratch/rickn/hdf5/src/H5Dmpio.c:960
30 0x00000000004955bd H5D__write()  /scratch/rickn/hdf5/src/H5Dio.c:780
31 0x00000000007038e9 H5VL__native_dataset_write()  /scratch/rickn/hdf5/src/H5VLnative_dataset.c:206
32 0x00000000006e36f3 H5VL__dataset_write()  /scratch/rickn/hdf5/src/H5VLcallback.c:2151
33 0x00000000006ecab6 H5VL_dataset_write()  /scratch/rickn/hdf5/src/H5VLcallback.c:2185
34 0x0000000000493da0 H5Dwrite()  /scratch/rickn/hdf5/src/H5Dio.c:313
35 0x0000000000404183 main()  /scratch/rickn/test_hdf5/test_orig.c:111
36 0x00000000000202e0 __libc_start_main()  ???:0
37 0x0000000000403c5a _start()  /tmp/nix-build-glibc-2.24.drv-0/glibc-2.24/csu/../sysdeps/x86_64/start.S:120
===================

With the release of HDF5 1.10.7, I wanted to run the tests again to see if anything had changed. I found that the minimal test from the top of this thread fails with both ompio and romio321:

$ mpirun -np 4 --mca io ompio ./testh5
MPI rank [0/4]
rank=0 creating file
MPI rank [1/4]
rank=1 creating file
MPI rank [2/4]
rank=2 creating file
MPI rank [3/4]
rank=3 creating file
rank=0 creating selection [0:4, 0:4194304]
rank=1 creating selection [4:8, 0:4194304]
rank=2 creating selection [8:12, 0:4194304]
rank=3 creating selection [12:16, 0:4194304]
rank=2 creating dataset1
rank=0 creating dataset1
rank=1 creating dataset1
rank=3 creating dataset1
rank=0 writing dataset1
rank=2 writing dataset1
rank=3 writing dataset1
rank=1 writing dataset1
rank=2 finished writing dataset1
rank=2 creating dataset2
rank=0 finished writing dataset1
rank=0 creating dataset2
rank=3 finished writing dataset1
rank=3 creating dataset2
HDF5-DIAG: Error detected in HDF5 (1.10.7) MPI-process 2:
  #000: H5D.c line 152 in H5Dcreate2(): unable to create dataset
    major: Dataset
    minor: Unable to initialize object
  #001: H5Dint.c line 338 in H5D__create_named(): unable to create and link to dataset
    major: Dataset
    minor: Unable to initialize object
  #002: H5L.c line 1605 in H5L_link_object(): unable to create new link to object
    major: Links
    minor: Unable to initialize object
  #003: H5L.c line 1846 in H5L__create_real(): can't insert link
    major: Links
    minor: Unable to insert object
  #004: H5Gtraverse.c line 848 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #005: H5Gtraverse.c line 579 in H5G__traverse_real(): can't look up component
    major: Symbol table
    minor: Object not found
  #006: H5Gobj.c line 1118 in H5G__obj_lookup(): can't check for link info message
    major: Symbol table
    minor: Can't get value
  #007: H5Gobj.c line 324 in H5G__obj_get_linfo(): unable to read object header
    major: Symbol table
    minor: Can't get value
  #008: H5Omessage.c line 873 in H5O_msg_exists(): unable to protect object header
    major: Object header
    minor: Unable to protect metadata
  #009: H5Oint.c line 1056 in H5O_protect(): unable to load object header
    major: Object header
    minor: Unable to protect metadata
  #010: H5AC.c line 1517 in H5AC_protect(): H5C_protect() failed
    major: Object cache
    minor: Unable to protect metadata
  #011: H5C.c line 2454 in H5C_protect(): MPI_Bcast failed
    major: Internal error (too specific to document in detail)
    minor: Some MPI function failed
  #012: H5C.c line 2454 in H5C_protect(): MPI_ERR_TRUNCATE: message truncated
    major: Internal error (too specific to document in detail)
    minor: MPI Error String
rank=2 writing dataset2
rank=1 finished writing dataset1
rank=1 creating dataset2
HDF5-DIAG: Error detected in HDF5 (1.10.7) MPI-process 3:
  #000: H5D.c line 152 in H5Dcreate2(): unable to create dataset
    major: Dataset
    minor: Unable to initialize object
  #001: H5Dint.c line 338 in H5D__create_named(): unable to create and link to dataset
    major: Dataset
    minor: Unable to initialize object
  #002: H5L.c line 1605 in H5L_link_object(): unable to create new link to object
    major: Links
    minor: Unable to initialize object
  #003: H5L.c line 1846 in H5L__create_real(): can't insert link
    major: Links
    minor: Unable to insert object
  #004: H5Gtraverse.c line 848 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #005: H5Gtraverse.c line 579 in H5G__traverse_real(): can't look up component
    major: Symbol table
    minor: Object not found
  #006: H5Gobj.c line 1118 in H5G__obj_lookup(): can't check for link info message
    major: Symbol table
    minor: Can't get value
  #007: H5Gobj.c line 324 in H5G__obj_get_linfo(): unable to read object header
    major: Symbol table
    minor: Can't get value
  #008: H5Omessage.c line 873 in H5O_msg_exists(): unable to protect object header
    major: Object header
    minor: Unable to protect metadata
  #009: H5Oint.c line 1056 in H5O_protect(): unable to load object header
    major: Object header
    minor: Unable to protect metadata
  #010: H5AC.c line 1517 in H5AC_protect(): H5C_protect() failed
    major: Object cache
    minor: Unable to protect metadata
  #011: H5C.c line 2454 in H5C_protect(): MPI_Bcast failed
    major: Internal error (too specific to document in detail)
    minor: Some MPI function failed
  #012: H5C.c line 2454 in H5C_protect(): MPI_ERR_TRUNCATE: message truncated
    major: Internal error (too specific to document in detail)
    minor: MPI Error String
rank=3 writing dataset2
HDF5-DIAG: Error detected in HDF5 (1.10.7) MPI-process 3:
  #000: H5Dio.c line 313 in H5Dwrite(): dset_id is not a dataset ID
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.10.7) MPI-process 3:
  #000: H5D.c line 334 in H5Dclose(): not a dataset ID
    major: Invalid arguments to routine
    minor: Inappropriate type
rank=3 closing everything
HDF5-DIAG: Error detected in HDF5 (1.10.7) MPI-process 2:
  #000: H5Dio.c line 313 in H5Dwrite(): dset_id is not a dataset ID
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.10.7) MPI-process 2:
  #000: H5D.c line 334 in H5Dclose(): not a dataset ID
    major: Invalid arguments to routine
    minor: Inappropriate type
rank=2 closing everything

Hi again @jrichardshaw, @rick and @wkliao,

after an unfortunately long time (2 years!), I’ve finally been able to get back to this part of HDF5 and have come up with a fix specifically for the cases above where the MPI_ERR_TRUNCATE issue/crash was happening. With this fix in place, I am able to run the small example in this thread under HDF5 1.13, 1.12 and 1.10 with all chunking parameters available in the example program and with compression both enabled and disabled. This fix is being merged to the development branches for HDF5 1.13, 1.12 and 1.10.

While this is hopefully good news, there are two concerns I have in regards to this fix and the overall parallel compression feature that I hope others may be able to comment on or help with:

  1. I never encountered the MPI_ERR_TRUNCATE issue with compression enabled, only with it disabled. With compression enabled, the example in this thread always seemed to run fine for me. However, one of @jrichardshaw’s earlier posts seemed to hint that an error was still encountered with certain chunking parameters and compression enabled. If this is still the case after this fix, I’d definitely be interested in knowing about it, since I’m actively working on the parallel compression feature.

  2. There seem to be a few other types of crashes reported in this thread that appear unrelated to the MPI_ERR_TRUNCATE issue. While at least one appears to have been related to the I/O backend that MPI was using, it is unclear whether all of the other issues have been resolved. So again, if any of these issues are still encountered after my fix is in place, please let me know so we can try to figure them out.


Hi @jhenderson !

I have been trying to write a large 3D array of shape (3, Superlarge, 4000) in parallel, where I split Superlarge into digestible chunks. Superlarge could be >10,000,000. For smaller arrays I do not get this “address not mapped to object at address …” error, only for large arrays. Are there any limitations on compressed, chunked, parallel writing that I’m not aware of?

Thanks!

Cheers,

Lucas

How are you parallelizing your code?

I’m using the Python API h5py. I posted quite a bit about it in their issues before coming here: #2182 and #2211.

I could serialize the writing, but to compress the array the file needs to be open, and doing the compression in serial for my dataset is just not feasible in a reasonable timeframe.

Hi @lsawadel,

in terms of amount of data, there aren’t (or shouldn’t be) any limitations inherent to the parallel writing of filtered data, other than what the system can handle in terms of memory usage. Does the segfault you’re seeing seem to be identical to the one mentioned previously in this thread? If not, I’d suggest starting a new topic with as much detail as you can provide since the original error for this thread should have been resolved in the latest HDF5 major releases.

Since a major overhaul was made to the feature in the 1.10.9, 1.12.2 and 1.13.1 releases, it’s possible a bug was introduced (based on your comments about versions tested in #2211), but the bug may have also been present in HDF5 1.12.1 without being triggered.
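
For reference, the basic pattern the parallel compression feature expects looks roughly like the following C sketch (h5py’s chunked/compressed dataset creation plus collective writes map onto the same calls); the dataset name, sizes and gzip level are all illustrative:

/* Sketch of a parallel compressed chunked write: chunked + deflate set on
 * the dataset creation property list, and a collective transfer property
 * list for H5Dwrite (required for filtered datasets in parallel). */
#include <mpi.h>
#include <hdf5.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate("filtered.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* One row of 1024 floats per rank; chunked and gzip-compressed. */
    hsize_t dims[2]  = {(hsize_t)nranks, 1024};
    hsize_t chunk[2] = {1, 1024};
    hid_t space = H5Screate_simple(2, dims, NULL);
    hid_t dcpl  = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 2, chunk);
    H5Pset_deflate(dcpl, 4);

    hid_t dset = H5Dcreate2(file, "data", H5T_NATIVE_FLOAT, space,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT);

    /* Each rank selects its own row and writes collectively. */
    hsize_t start[2] = {(hsize_t)rank, 0}, count[2] = {1, 1024};
    hid_t fspace = H5Dget_space(dset);
    H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, count, NULL);
    hid_t mspace = H5Screate_simple(2, count, NULL);

    float *buf = malloc(1024 * sizeof(float));
    for (int i = 0; i < 1024; i++) buf[i] = (float)rank;

    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    H5Dwrite(dset, H5T_NATIVE_FLOAT, mspace, fspace, dxpl, buf);

    free(buf);
    H5Pclose(dxpl); H5Sclose(mspace); H5Sclose(fspace);
    H5Dclose(dset); H5Pclose(dcpl); H5Sclose(space);
    H5Fclose(file); H5Pclose(fapl);
    MPI_Finalize();
    return 0;
}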

Thanks for taking your time to check out my tests!

The error seems identical, but the segfault happens at a much larger dimension than before. I had been testing for a while without any issues whatsoever on small datasets, and this only appeared once I tried to scale up my code.

One observation I have made is that as long as I keep the 2nd dimension under ~50,000 on each core, so that each core writes arrays like (3, <~50,000, 4,000), the code seems to be mostly fine.

Another observation so far is that the segfault is more likely to happen if I use lzf compression (compiled as part of the h5py build) than if I use gzip. I have not really tried any other compression filters yet.

@lsawade What is the datatype of the array you are trying to write out? Giving the shape info is good but without the datatype we cannot know the total size (bytes).

-Aleksandar

Hi!
I have been testing with float16, float32, uint16, and uint32, all of which show similar behavior, but, notably, float32 would crash at approximately half the array size that float16 does. That made me believe at first that it was a memory issue, but running the code across multiple nodes with a ton of memory made me realize that is not the case. I will prep a GitHub repo this afternoon with scripts showing how I download, configure, and compile HDF5, as well as a failing example, so that we have a common starting point. Thanks already for checking in with me!
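
Purely as a hypothetical back-of-envelope on my side (not a diagnosis, just something that lines up with the factor-of-two scaling I see between float16 and float32):

/* Hypothetical back-of-envelope only: the per-rank payload for a
 * (3, 50000, 4000) block is
 *   3 * 50000 * 4000 = 6.0e8 elements
 *   * 4 bytes (float32) = 2.4e9 bytes  > 2^31 - 1 (INT_MAX)
 *   * 2 bytes (float16) = 1.2e9 bytes  < 2^31 - 1
 * Many MPI routines still take 32-bit signed counts internally, which
 * would be consistent with float32 failing at roughly half the float16
 * size, but this is speculation on my part. */
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const uint64_t elements = 3ULL * 50000ULL * 4000ULL;
    printf("float32 bytes per rank: %llu (INT_MAX = %d)\n",
           (unsigned long long)(elements * 4), INT_MAX);
    printf("float16 bytes per rank: %llu\n",
           (unsigned long long)(elements * 2));
    return 0;
}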

Hi everyone,

I have been working with the attached repo to test what I need. I’m testing this on a large cluster node with 600 GB of RAM, so memory should not be the issue here. The script runs with 3 ranks; in the real-world scenario it will be many more, but with the same setup, just a much larger array in the second dimension.

I have tested this on both Intel and IBM Power9 with different compilers, and it does not seem to be a platform-dependent issue. I’d be very happy if someone just looks at this and tells me “you dummy, you can’t xyz…”. If there are things unrelated to the segfault that I could be doing better, I also don’t mind input. I’m relatively new to HDF5/h5py, so every tip helps.

The specific commit I made yesterday is linked here for posterity.