Crash when writing parallel compressed chunks


#21

Just an observation from a run compiled with 1.10.6 + the patch provided by @jhenderson earlier in this discussion chain.

When the 2nd H5Dcreate call is moved to before the 1st H5Dwrite in @jrichardshaw’s test program, the error was gone. It looks like the problem occurs when calls to H5Dcreate and H5Dwrite are interleaved.

After reading through H5C.c and adding a few printf statements, it appears that the values of entry_ptr->coll_access checked at line 2271 are not consistent among the 4 running processes. As a result, only 2 of the 4 processes call MPI_Bcast at line 2297, which leads to the error.
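
To make the failure mode concrete, here is a toy model in plain C. It is only a sketch and makes no MPI or HDF5 calls: the array stands in for each rank's private copy of entry_ptr->coll_access, and the helper names (count_bcast_callers, collective_is_matched) are made up for illustration. The assumption is that a rank enters the broadcast only when its own copy of the flag is set, so a collective is safe only when all ranks agree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: coll_access[r] stands in for rank r's private copy of
 * the per-entry flag. A rank is assumed to enter the broadcast only when
 * its own copy is set. Nothing here calls MPI or HDF5. */

/* Count the ranks that would enter the collective broadcast. */
size_t count_bcast_callers(const bool *coll_access, size_t nranks)
{
    assert(coll_access != NULL && nranks > 0);
    size_t callers = 0;
    for (size_t r = 0; r < nranks; r++) {
        if (coll_access[r])
            callers++;
    }
    return callers;
}

/* A collective is matched only if every rank calls it or none does. */
bool collective_is_matched(const bool *coll_access, size_t nranks)
{
    size_t callers = count_bcast_callers(coll_access, nranks);
    return callers == 0 || callers == nranks;
}
```

With flags such as {true, false, true, false} for 4 ranks, count_bcast_callers returns 2 and collective_is_matched returns false, which corresponds to the situation described above where only 2 of the 4 processes reach MPI_Bcast.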


#22

Just to follow up: the most recent set of parameters in the Gist also fails on HDF5 1.12.0 (a separate discussion with some HDF5 staff suggested it might be fixed, which got my hopes up). Here is the crash output (I think it’s largely the same as before):

MPI rank [0/4]
rank=0 creating file
MPI rank [1/4]
rank=1 creating file
MPI rank [2/4]
rank=2 creating file
MPI rank [3/4]
rank=3 creating file
rank=0 creating selection [0:4, 0:4194304]
rank=0 creating dataset1
rank=1 creating selection [4:8, 0:4194304]
rank=1 creating dataset1
rank=2 creating selection [8:12, 0:4194304]
rank=2 creating dataset1
rank=3 creating selection [12:16, 0:4194304]
rank=3 creating dataset1
rank=1 writing dataset1
rank=2 writing dataset1
rank=0 writing dataset1
rank=3 writing dataset1
rank=3 finished writing dataset1
rank=3 waiting at barrier
rank=0 finished writing dataset1
rank=0 waiting at barrier
rank=1 finished writing dataset1
rank=1 waiting at barrier
rank=0 creating dataset2
rank=2 finished writing dataset1
rank=2 waiting at barrier
rank=2 creating dataset2
rank=3 creating dataset2
rank=1 creating dataset2
HDF5-DIAG: Error detected in HDF5 (1.12.0) MPI-processHDF5-DIAG: Error detected in HDF5 (1.12.0) MPI-process 3:
  # 2:
  #000: H5D.c line 151 in H5Dcreate2(): unable to create dataset
    major: 000: H5D.c line 151 in H5Dcreate2(): unable to create dataset
    major: Dataset
    minor: Unable to initialize object
  #001: H5VLcallback.c line 1869 in H5VL_dataset_create(): dataset create failed
    major:Dataset
    minor: Unable to initialize object
  #001: H5VLcallback.c line 1869 in H5VL_dataset_create(): dataset create failed
    major: Virtual Object Layer
    minor: Unable to create file
  # Virtual Object Layer
    minor: Unable to create file
  #002: H5VLcallback.c line 1835 in H5VL__dataset_create(): dataset create failed
    ma002: H5VLcallback.c line 1835 in H5VL__dataset_create(): dataset create failed
    major: Virtual Object Layer
    minor: Unable to create file
  #003: H5VLnative_dataset.c line jor: Virtual Object Layer
    minor: Unable to create file
  #003: H5VLnative_dataset.c line 75 in H5VL__native_dataset_create(): unable to create dataset
    major:75 in H5VL__native_dataset_create(): unable to create dataset
    major: Dataset
    minor: Unable to initialize object
  #004 Dataset
    minor: Unable to initialize object
  #004: H5Dint.c line 411 in H5D__create_named: H5Dint.c line 411 in H5D__create_named(): unable to create and link
 to dataset
    major: Dataset
    m(): unable to create and link to dataset
    major: Dataset
    minor: Unable to initialize object
  #inor: Unable to initialize object
  #005: H5L.c line 1804 in H5L_link_object(): 005: H5L.c line 1804 in H5L_link_object(): unable to create new link
to object
    major: unable to create new link to object
    major: Links
    minor: Unable to initialize object
  #006: H5L.c lLinks
    minor: Unable to initialize object
  #006: H5L.c line 2045 in H5L__create_realine 2045 in H5L__create_real(): can't insert link
    major: Links
    minor(): can't insert link
    major: Links
    minor: Unable to insert object
  #007: Unable to insert object
  #007: H5Gtraverse.c line 855 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: H5Gtraverse.c line 855 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #008: : Object not found
  #008: H5Gtraverse.c line 585 in H5G__traverse_real(): can't look up component
    major:H5Gtraverse.c line 585 in H5G__traverse_real(): can't look up component
    major: Symbol table
    minor: Object not found
  #009:  Symbol table
    minor: Object not found
  #009: H5Gobj.c line 1125 in H5G__obj_lookup(): can't check for link info message
    majorH5Gobj.c line 1125 in H5G__obj_lookup(): can't check for link info message
    major: Symbol table
    min: Symbol table
    minor: Can't get value
  #010: H5Gobj.c line 326 in H5G__obj_get_linfoor: Can't get value
  #010: H5Gobj.c line 326 in H5G__obj_get_linfo(): (): unable to read object header
    major: Symbol table
    minor: Can't get value
  #unable to read object header
    major: Symbol table
    minor: Can't get value
  #011: 011: H5Omessage.c line 883 in H5O_msg_exists(): unable to protect object header
    major: H5Omessage.c line 883 in H5O_msg_exists(): unable to protect object header
    major: Object header
Object header
    minor: Unable to protect metadata
  #012: H5Oint.c line 1082 i    minor: Unable to protect metadata
  #012: H5Oint.c line 1082 in n H5O_protect(): unable to load object header
    major: Object header
    minor: Unable to protect metadata
H5O_protect(): unable to load object header
    major: Object header
    minor: Unable to protect metadata
  #  #013: H5AC.c line 1312 in H5AC_protect(): H5C_protect() failed
    majo013: H5AC.c line 1312 in H5AC_protect(): H5C_protect() failed
    major: r: Object cache
    minor: Unable to protect metadata
  #014: H5C.c line 2299 Object cache
    minor: Unable to protect metadata
  #014: H5C.c line 2299 inin H5C_protect(): MPI_Bcast failed
    major: Internal error (too specific to document in detail)
    minor: Some MPI function failed H5C_protect(): MPI_Bcast failed
    major: Internal error (too specific to document in detail)
    minor: Some MPI function failed

  #015: H5C.c line 2299 in H5C_protect(): MPI_ERR_TRUNCATE: message truncated
    maj#015: H5C.c line 2299 in H5C_protect(): MPI_ERR_TRUNCATE: message truncated
    major: or: Internal error (too specific to document in detail)
    minor: MPI Error String
rank=2 writing dataset2
rank=3 writing dataset2
Internal error (too specific to document in detail)
    minor: MPI Error String
HDF5-DIAG: Error detected in HDF5 (1.12.0) MPI-process 2:
  #000: H5Dio.c line 300 in H5Dwrite(): dset_id is not a dataset ID
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.12.0) MPI-process 2:
  #000: H5D.c line 332 in H5Dclose(): not a dataset ID
    major: Invalid arguments to routine
    minor: Inappropriate type
rank=2 closing everything
HDF5-DIAG: Error detected in HDF5 (1.12.0) MPI-process 3:
  #000: H5Dio.c line 300 in H5Dwrite(): dset_id is not a dataset ID
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.12.0) MPI-process 3:
  #000: H5D.c line 332 in H5Dclose(): not a dataset ID
    major: Invalid arguments to routine
    minor: Inappropriate type
rank=3 closing everything

#23

Hi @jrichardshaw,

unfortunately there hasn’t been much time to look at this. However, we do know of some other folks that are looking for a fix to this issue as well. Based on @wkliao’s observation, I’m fairly certain that it’s just a problem of needing to insert barriers in the appropriate place in the library’s code. I remember having an issue reproducing this using your example, so I wasn’t quite able to determine if this really was the source of the issue, but I’m thinking that running several rounds of

H5Dcreate(...);
H5Dwrite(...);

should eventually produce the issue for me. In any case, I believe there should be more info on this issue relatively soon.


#24

Hi again @jrichardshaw, @wkliao and others in this thread. I’ve narrowed down the cause of this issue and will have a small patch to post after I’ve discussed the fix with other developers. Provided that that patch works here and doesn’t cause further issues, we should be able to get the fix in quickly afterwards.


#25

Wonderful. Thanks @jhenderson! I’ll be happy to test the patch whenever you post it.


#26

Hi @jrichardshaw and @wkliao,

attached is a small patch against the 1.12 branch that temporarily disables the collective metadata reads feature in HDF5, which should make the issue disappear for now. However, this is only a temporary fix and may affect performance. The issue stems from an oversight in the design of the collective metadata reads feature that has effectively been masked until recently, and it will need to be fixed. While this feature wasn’t specifically enabled in your test program, there are some cases in the library where we implicitly turn the feature on because metadata modifications need to be collective, such as for H5Dcreate. That behavior, combined with your chosen chunk size and number of chunks, was right on the line needed to cause the issue to appear. The timeline for fixing this correctly isn’t clear yet, but we hope to be able to fix it in time for the next release of HDF5.

disable_coll_md_reads.patch (480 Bytes)


#27

Thanks for the patch @jhenderson. We’ve been testing the patch but we’re still having failures. One of my colleagues has posted a fuller description (post is awaiting approval), but what we’re finding is that it works with the nominal test case above, but if we go back to the first set of parameters (CHUNK1=32768; NCHUNK1=32), it hangs. This seems more similar to the first issue found in this thread.

Anyway, I think my colleague’s pending post has more details (including stack traces), so I won’t try to repeat them here.


#28

Thanks for the latest patch @jhenderson
I applied it to both the HEAD of the hdf5_1_12 branch as well as the tag hdf5-1_12_0.
Unfortunately, the minimal test supplied by @jrichardshaw still hangs when built against either of these if I uncomment

// Equivalent to original gist
// Works on 1.10.5 with patch, crashes on 1.10.5 vanilla and hangs on 1.10.6
#define CHUNK1 32768
#define NCHUNK1 32

This is the stack trace I got using tmpi 4 gdb ./testh5:

#0  0x00002aaaab49e9a7 in PMPI_Type_size_x ()                                                                                               │#0  0x00002aaaab49e994 in PMPI_Type_size_x ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40                              │   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#1  0x00002aaaab52d0f3 in ADIOI_GEN_WriteContig ()                                                                                          │#1  0x00002aaaab52d0f3 in ADIOI_GEN_WriteContig ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40                              │   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#2  0x00002aaaab531323 in ADIOI_GEN_WriteStrided ()                                                                                         │#2  0x00002aaaab531323 in ADIOI_GEN_WriteStrided ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40                              │   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#3  0x00002aaaab52faab in ADIOI_GEN_WriteStridedColl ()                                                                                     │#3  0x00002aaaab52faab in ADIOI_GEN_WriteStridedColl ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40                              │   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#4  0x00002aaaab544fac in MPIOI_File_write_all ()                                                                                           │#4  0x00002aaaab544fac in MPIOI_File_write_all ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40                              │   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#5  0x00002aaaab545531 in mca_io_romio_dist_MPI_File_write_at_all ()                                                                        │#5  0x00002aaaab545531 in mca_io_romio_dist_MPI_File_write_at_all ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40                              │   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#6  0x00002aaaab514922 in mca_io_romio321_file_write_at_all ()                                                                              │#6  0x00002aaaab514922 in mca_io_romio321_file_write_at_all ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40                              │   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#7  0x00002aaaab4848a8 in PMPI_File_write_at_all ()                                                                                         │#7  0x00002aaaab4848a8 in PMPI_File_write_at_all ()
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40                              │   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
#8  0x000000000073d5a5 in H5FD__mpio_write (_file=0xceec90, type=H5FD_MEM_DRAW, dxpl_id=<optimized out>, addr=3688, size=<optimized out>,   │#8  0x000000000073d5a5 in H5FD__mpio_write (_file=0xceec30, type=H5FD_MEM_DRAW, dxpl_id=<optimized out>, addr=16780904, 
    buf=0x2aaaba5fb010) at H5FDmpio.c:1466                                                                                                  │    size=<optimized out>, buf=0x2aaaba5fb010) at H5FDmpio.c:1466
#9  0x00000000004f2413 in H5FD_write (file=file@entry=0xceec90, type=type@entry=H5FD_MEM_DRAW, addr=addr@entry=3688, size=size@entry=1,     │#9  0x00000000004f2413 in H5FD_write (file=file@entry=0xceec30, type=type@entry=H5FD_MEM_DRAW, addr=addr@entry=16780904, size=size@entry=1, 
    buf=buf@entry=0x2aaaba5fb010) at H5FDint.c:248                                                                                          │    buf=buf@entry=0x2aaaba5fb010) at H5FDint.c:248
#10 0x000000000077eea5 in H5F__accum_write (f_sh=f_sh@entry=0xcf02d0, map_type=map_type@entry=H5FD_MEM_DRAW, addr=addr@entry=3688,          │#10 0x000000000077eea5 in H5F__accum_write (f_sh=f_sh@entry=0xcf02d0, map_type=map_type@entry=H5FD_MEM_DRAW, addr=addr@entry=16780904, 
    size=size@entry=1, buf=buf@entry=0x2aaaba5fb010) at H5Faccum.c:826                                                                      │    size=size@entry=1, buf=buf@entry=0x2aaaba5fb010) at H5Faccum.c:826
#11 0x00000000005ef5b7 in H5PB_write (f_sh=f_sh@entry=0xcf02d0, type=type@entry=H5FD_MEM_DRAW, addr=addr@entry=3688, size=size@entry=1,     │#11 0x00000000005ef5b7 in H5PB_write (f_sh=f_sh@entry=0xcf02d0, type=type@entry=H5FD_MEM_DRAW, addr=addr@entry=16780904, size=size@entry=1, 
    buf=buf@entry=0x2aaaba5fb010) at H5PB.c:1031                                                                                            │    buf=buf@entry=0x2aaaba5fb010) at H5PB.c:1031
#12 0x00000000004d9079 in H5F_shared_block_write (f_sh=0xcf02d0, type=type@entry=H5FD_MEM_DRAW, addr=3688, size=size@entry=1,               │#12 0x00000000004d9079 in H5F_shared_block_write (f_sh=0xcf02d0, type=type@entry=H5FD_MEM_DRAW, addr=16780904, size=size@entry=1, 
    buf=0x2aaaba5fb010) at H5Fio.c:205                                                                                                      │    buf=0x2aaaba5fb010) at H5Fio.c:205
#13 0x000000000073a113 in H5D__mpio_select_write (io_info=0x7fffffff82e0, type_info=<optimized out>, mpi_buf_count=1,                       │#13 0x000000000073a113 in H5D__mpio_select_write (io_info=0x7fffffff82e0, type_info=<optimized out>, mpi_buf_count=1, 
    file_space=<optimized out>, mem_space=<optimized out>) at H5Dmpio.c:490                                                                 │    file_space=<optimized out>, mem_space=<optimized out>) at H5Dmpio.c:490
#14 0x0000000000730e2b in H5D__final_collective_io (io_info=io_info@entry=0x7fffffff82e0, type_info=type_info@entry=0x7fffffff8260,         │#14 0x0000000000730e2b in H5D__final_collective_io (io_info=io_info@entry=0x7fffffff82e0, type_info=type_info@entry=0x7fffffff8260, 
    mpi_buf_count=mpi_buf_count@entry=1, mpi_file_type=0xd70760, mpi_buf_type=0xd717a0) at H5Dmpio.c:2124                                   │    mpi_buf_count=mpi_buf_count@entry=1, mpi_file_type=0xd6f8d0, mpi_buf_type=0xd70910) at H5Dmpio.c:2124
#15 0x0000000000736129 in H5D__link_chunk_collective_io (io_info=io_info@entry=0x7fffffff82e0, type_info=type_info@entry=0x7fffffff8260,    │#15 0x0000000000736129 in H5D__link_chunk_collective_io (io_info=io_info@entry=0x7fffffff82e0, type_info=type_info@entry=0x7fffffff8260, 
    fm=fm@entry=0xd110c0, sum_chunk=<optimized out>) at H5Dmpio.c:1234                                                                      │    fm=fm@entry=0xd10800, sum_chunk=<optimized out>) at H5Dmpio.c:1234
#16 0x0000000000739b11 in H5D__chunk_collective_io (io_info=io_info@entry=0x7fffffff82e0, type_info=type_info@entry=0x7fffffff8260,         │#16 0x0000000000739b11 in H5D__chunk_collective_io (io_info=io_info@entry=0x7fffffff82e0, type_info=type_info@entry=0x7fffffff8260, 
    fm=fm@entry=0xd110c0) at H5Dmpio.c:883                                                                                                  │    fm=fm@entry=0xd10800) at H5Dmpio.c:883
#17 0x000000000073a519 in H5D__chunk_collective_write (io_info=0x7fffffff82e0, type_info=0x7fffffff8260, nelmts=<optimized out>,            │#17 0x000000000073a519 in H5D__chunk_collective_write (io_info=0x7fffffff82e0, type_info=0x7fffffff8260, nelmts=<optimized out>, 
    file_space=<optimized out>, mem_space=<optimized out>, fm=0xd110c0) at H5Dmpio.c:960                                                    │    file_space=<optimized out>, mem_space=<optimized out>, fm=0xd10800) at H5Dmpio.c:960
#18 0x00000000004955ac in H5D__write (dataset=dataset@entry=0xcf4db0, mem_type_id=mem_type_id@entry=216172782113783850,                     │#18 0x00000000004955ac in H5D__write (dataset=dataset@entry=0xcf46e0, mem_type_id=mem_type_id@entry=216172782113783850, mem_space=0xce4fd0, 
    mem_space=0xce5050, file_space=0xce2f40, buf=<optimized out>, buf@entry=0x2aaaba5fb010) at H5Dio.c:780                                  │    file_space=0xce2ec0, buf=<optimized out>, buf@entry=0x2aaaba5fb010) at H5Dio.c:780
#19 0x00000000007038d8 in H5VL__native_dataset_write (obj=0xcf4db0, mem_type_id=216172782113783850, mem_space_id=288230376151711748,        │#19 0x00000000007038d8 in H5VL__native_dataset_write (obj=0xcf46e0, mem_type_id=216172782113783850, mem_space_id=288230376151711748, 
    file_space_id=288230376151711747, dxpl_id=<optimized out>, buf=0x2aaaba5fb010, req=0x0) at H5VLnative_dataset.c:206                     │    file_space_id=288230376151711747, dxpl_id=<optimized out>, buf=0x2aaaba5fb010, req=0x0) at H5VLnative_dataset.c:206
#20 0x00000000006e36e2 in H5VL__dataset_write (obj=0xcf4db0, cls=0xac3520, mem_type_id=mem_type_id@entry=216172782113783850,                │#20 0x00000000006e36e2 in H5VL__dataset_write (obj=0xcf46e0, cls=0xac3520, mem_type_id=mem_type_id@entry=216172782113783850, 
    mem_space_id=mem_space_id@entry=288230376151711748, file_space_id=file_space_id@entry=288230376151711747,                               │    mem_space_id=mem_space_id@entry=288230376151711748, file_space_id=file_space_id@entry=288230376151711747, 
    dxpl_id=dxpl_id@entry=792633534417207318, buf=0x2aaaba5fb010, req=0x0) at H5VLcallback.c:2151                                           │    dxpl_id=dxpl_id@entry=792633534417207318, buf=0x2aaaba5fb010, req=0x0) at H5VLcallback.c:2151
#21 0x00000000006ecaa5 in H5VL_dataset_write (vol_obj=vol_obj@entry=0xcf4c50, mem_type_id=mem_type_id@entry=216172782113783850,             │#21 0x00000000006ecaa5 in H5VL_dataset_write (vol_obj=vol_obj@entry=0xcf4580, mem_type_id=mem_type_id@entry=216172782113783850, 
    mem_space_id=mem_space_id@entry=288230376151711748, file_space_id=file_space_id@entry=288230376151711747,                               │    mem_space_id=mem_space_id@entry=288230376151711748, file_space_id=file_space_id@entry=288230376151711747, 
    dxpl_id=dxpl_id@entry=792633534417207318, buf=buf@entry=0x2aaaba5fb010, req=0x0) at H5VLcallback.c:2185                                 │    dxpl_id=dxpl_id@entry=792633534417207318, buf=buf@entry=0x2aaaba5fb010, req=0x0) at H5VLcallback.c:2185
#22 0x0000000000493d8f in H5Dwrite (dset_id=<optimized out>, mem_type_id=216172782113783850, mem_space_id=288230376151711748,               │#22 0x0000000000493d8f in H5Dwrite (dset_id=<optimized out>, mem_type_id=216172782113783850, mem_space_id=288230376151711748, 
    file_space_id=288230376151711747, dxpl_id=792633534417207318, buf=0x2aaaba5fb010) at H5Dio.c:313                                        │    file_space_id=288230376151711747, dxpl_id=792633534417207318, buf=0x2aaaba5fb010) at H5Dio.c:313
#23 0x0000000000404096 in main (argc=1, argv=0x7fffffff8728) at test_ph5.c:98                                                               │#23 0x0000000000404096 in main (argc=1, argv=0x7fffffff8728) at test_ph5.c:98

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
#0  0x00002aaab8220b46 in psm2_mq_ipeek2 () from /cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/lib/libpsm2.so.2                   │#0  0x00002aaaabf92de8 in opal_progress ()
#1  0x00002aaab8002409 in ompi_mtl_psm2_progress ()                                                                                         │   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libopen-pal.so.40
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/openmpi/mca_mtl_psm2.so                   │#1  0x00002aaaab45f435 in ompi_request_default_wait ()
#2  0x00002aaaabf92e0b in opal_progress ()                                                                                                  │   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libopen-pal.so.40                         │#2  0x00002aaaab4bf303 in ompi_coll_base_sendrecv_actual ()
#3  0x00002aaaab45f435 in ompi_request_default_wait ()                                                                                      │   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40                              │#3  0x00002aaaab4bf739 in ompi_coll_base_allreduce_intra_recursivedoubling ()
#4  0x00002aaaab4bf303 in ompi_coll_base_sendrecv_actual ()                                                                                 │   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40                              │#4  0x00002aaaab4735b8 in PMPI_Allreduce ()
#5  0x00002aaaab4bf739 in ompi_coll_base_allreduce_intra_recursivedoubling ()                                                               │   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40                              │#5  0x00002aaaab543afc in mca_io_romio_dist_MPI_File_set_view ()
#6  0x00002aaaab4735b8 in PMPI_Allreduce ()                                                                                                 │   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40                              │#6  0x00002aaaab5139ab in mca_io_romio321_file_set_view ()
#7  0x00002aaaab543afc in mca_io_romio_dist_MPI_File_set_view ()                                                                            │   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40                              │#7  0x00002aaaab483d68 in PMPI_File_set_view ()
#8  0x00002aaaab5139ab in mca_io_romio321_file_set_view ()                                                                                  │   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40                              │#8  0x000000000073d5cd in H5FD__mpio_write (_file=0xceeb70, type=H5FD_MEM_DRAW, dxpl_id=<optimized out>, addr=50340568, 
#9  0x00002aaaab483d68 in PMPI_File_set_view ()                                                                                             │    size=<optimized out>, buf=0x2aaaba5fb010) at H5FDmpio.c:1481
   from /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc8/openmpi/4.0.1/lib/libmpi.so.40                              │#9  0x00000000004f2413 in H5FD_write (file=file@entry=0xceeb70, type=type@entry=H5FD_MEM_DRAW, addr=addr@entry=50340568, size=size@entry=1, 
#10 0x000000000073d5cd in H5FD__mpio_write (_file=0xceeb90, type=H5FD_MEM_DRAW, dxpl_id=<optimized out>, addr=33558120,                     │    buf=buf@entry=0x2aaaba5fb010) at H5FDint.c:248
    size=<optimized out>, buf=0x2aaaba5fb010) at H5FDmpio.c:1481                                                                            │#10 0x000000000077eea5 in H5F__accum_write (f_sh=f_sh@entry=0xcf0260, map_type=map_type@entry=H5FD_MEM_DRAW, addr=addr@entry=50340568, 
#11 0x00000000004f2413 in H5FD_write (file=file@entry=0xceeb90, type=type@entry=H5FD_MEM_DRAW, addr=addr@entry=33558120,                    │    size=size@entry=1, buf=buf@entry=0x2aaaba5fb010) at H5Faccum.c:826
    size=size@entry=1, buf=buf@entry=0x2aaaba5fb010) at H5FDint.c:248                                                                       │#11 0x00000000005ef5b7 in H5PB_write (f_sh=f_sh@entry=0xcf0260, type=type@entry=H5FD_MEM_DRAW, addr=addr@entry=50340568, size=size@entry=1, 
#12 0x000000000077eea5 in H5F__accum_write (f_sh=f_sh@entry=0xcf0270, map_type=map_type@entry=H5FD_MEM_DRAW, addr=addr@entry=33558120,      │    buf=buf@entry=0x2aaaba5fb010) at H5PB.c:1031
    size=size@entry=1, buf=buf@entry=0x2aaaba5fb010) at H5Faccum.c:826                                                                      │#12 0x00000000004d9079 in H5F_shared_block_write (f_sh=0xcf0260, type=type@entry=H5FD_MEM_DRAW, addr=50340568, size=size@entry=1, 
#13 0x00000000005ef5b7 in H5PB_write (f_sh=f_sh@entry=0xcf0270, type=type@entry=H5FD_MEM_DRAW, addr=addr@entry=33558120,                    │    buf=0x2aaaba5fb010) at H5Fio.c:205
    size=size@entry=1, buf=buf@entry=0x2aaaba5fb010) at H5PB.c:1031                                                                         │#13 0x000000000073a113 in H5D__mpio_select_write (io_info=0x7fffffff82e0, type_info=<optimized out>, mpi_buf_count=1, 
#14 0x00000000004d9079 in H5F_shared_block_write (f_sh=0xcf0270, type=type@entry=H5FD_MEM_DRAW, addr=33558120, size=size@entry=1,           │    file_space=<optimized out>, mem_space=<optimized out>) at H5Dmpio.c:490
    buf=0x2aaaba5fb010) at H5Fio.c:205                                                                                                      │#14 0x0000000000730e2b in H5D__final_collective_io (io_info=io_info@entry=0x7fffffff82e0, type_info=type_info@entry=0x7fffffff8260, 
#15 0x000000000073a113 in H5D__mpio_select_write (io_info=0x7fffffff82e0, type_info=<optimized out>, mpi_buf_count=1,                       │    mpi_buf_count=mpi_buf_count@entry=1, mpi_file_type=0xd6f7f0, mpi_buf_type=0xd70830) at H5Dmpio.c:2124
    file_space=<optimized out>, mem_space=<optimized out>) at H5Dmpio.c:490                                                                 │#15 0x0000000000736129 in H5D__link_chunk_collective_io (io_info=io_info@entry=0x7fffffff82e0, type_info=type_info@entry=0x7fffffff8260, 
#16 0x0000000000730e2b in H5D__final_collective_io (io_info=io_info@entry=0x7fffffff82e0, type_info=type_info@entry=0x7fffffff8260,         │    fm=fm@entry=0xd10780, sum_chunk=<optimized out>) at H5Dmpio.c:1234
    mpi_buf_count=mpi_buf_count@entry=1, mpi_file_type=0xd6f800, mpi_buf_type=0xd70840) at H5Dmpio.c:2124                                   │#16 0x0000000000739b11 in H5D__chunk_collective_io (io_info=io_info@entry=0x7fffffff82e0, type_info=type_info@entry=0x7fffffff8260, 
#17 0x0000000000736129 in H5D__link_chunk_collective_io (io_info=io_info@entry=0x7fffffff82e0, type_info=type_info@entry=0x7fffffff8260,    │    fm=fm@entry=0xd10780) at H5Dmpio.c:883
    fm=fm@entry=0xd10790, sum_chunk=<optimized out>) at H5Dmpio.c:1234                                                                      │#17 0x000000000073a519 in H5D__chunk_collective_write (io_info=0x7fffffff82e0, type_info=0x7fffffff8260, nelmts=<optimized out>, 
#18 0x0000000000739b11 in H5D__chunk_collective_io (io_info=io_info@entry=0x7fffffff82e0, type_info=type_info@entry=0x7fffffff8260,
    fm=fm@entry=0xd10790) at H5Dmpio.c:883
#19 0x000000000073a519 in H5D__chunk_collective_write (io_info=0x7fffffff82e0, type_info=0x7fffffff8260, nelmts=<optimized out>,           
    file_space=<optimized out>, mem_space=<optimized out>, fm=0xd10790) at H5Dmpio.c:960                                                    │    file_space=<optimized out>, mem_space=<optimized out>, fm=0xd10780) at H5Dmpio.c:960
#20 0x00000000004955ac in H5D__write (dataset=dataset@entry=0xcf4630, mem_type_id=mem_type_id@entry=216172782113783850,                     │#18 0x00000000004955ac in H5D__write (dataset=dataset@entry=0xcf4620, mem_type_id=mem_type_id@entry=216172782113783850, mem_space=0xce4ff0, 
    mem_space=0xce4ff0, file_space=0xce2ee0, buf=<optimized out>, buf@entry=0x2aaaba5fb010) at H5Dio.c:780                                  │    file_space=0xce2ee0, buf=<optimized out>, buf@entry=0x2aaaba5fb010) at H5Dio.c:780
#21 0x00000000007038d8 in H5VL__native_dataset_write (obj=0xcf4630, mem_type_id=216172782113783850, mem_space_id=288230376151711748,        │#19 0x00000000007038d8 in H5VL__native_dataset_write (obj=0xcf4620, mem_type_id=216172782113783850, mem_space_id=288230376151711748, 
    file_space_id=288230376151711747, dxpl_id=<optimized out>, buf=0x2aaaba5fb010, req=0x0) at H5VLnative_dataset.c:206                     │    file_space_id=288230376151711747, dxpl_id=<optimized out>, buf=0x2aaaba5fb010, req=0x0) at H5VLnative_dataset.c:206
#22 0x00000000006e36e2 in H5VL__dataset_write (obj=0xcf4630, cls=0xac3520, mem_type_id=mem_type_id@entry=216172782113783850,                │#20 0x00000000006e36e2 in H5VL__dataset_write (obj=0xcf4620, cls=0xac3520, mem_type_id=mem_type_id@entry=216172782113783850, 
    mem_space_id=mem_space_id@entry=288230376151711748, file_space_id=file_space_id@entry=288230376151711747,                               │    mem_space_id=mem_space_id@entry=288230376151711748, file_space_id=file_space_id@entry=288230376151711747, 
    dxpl_id=dxpl_id@entry=792633534417207318, buf=0x2aaaba5fb010, req=0x0) at H5VLcallback.c:2151                                           │    dxpl_id=dxpl_id@entry=792633534417207318, buf=0x2aaaba5fb010, req=0x0) at H5VLcallback.c:2151
#23 0x00000000006ecaa5 in H5VL_dataset_write (vol_obj=vol_obj@entry=0xcf44d0, mem_type_id=mem_type_id@entry=216172782113783850,             │#21 0x00000000006ecaa5 in H5VL_dataset_write (vol_obj=vol_obj@entry=0xcf44c0, mem_type_id=mem_type_id@entry=216172782113783850, 
    mem_space_id=mem_space_id@entry=288230376151711748, file_space_id=file_space_id@entry=288230376151711747,                               │    mem_space_id=mem_space_id@entry=288230376151711748, file_space_id=file_space_id@entry=288230376151711747, 
    dxpl_id=dxpl_id@entry=792633534417207318, buf=buf@entry=0x2aaaba5fb010, req=0x0) at H5VLcallback.c:2185                                 │    dxpl_id=dxpl_id@entry=792633534417207318, buf=buf@entry=0x2aaaba5fb010, req=0x0) at H5VLcallback.c:2185
#24 0x0000000000493d8f in H5Dwrite (dset_id=<optimized out>, mem_type_id=216172782113783850, mem_space_id=288230376151711748,               │#22 0x0000000000493d8f in H5Dwrite (dset_id=<optimized out>, mem_type_id=216172782113783850, mem_space_id=288230376151711748, 
    file_space_id=288230376151711747, dxpl_id=792633534417207318, buf=0x2aaaba5fb010) at H5Dio.c:313                                        │    file_space_id=288230376151711747, dxpl_id=792633534417207318, buf=0x2aaaba5fb010) at H5Dio.c:313
#25 0x0000000000404096 in main (argc=1, argv=0x7fffffff8728) at test_ph5.c:98                                                               │#23 0x0000000000404096 in main (argc=1, argv=0x7fffffff8728) at test_ph5.c:98