Hello,
test code is here:
henry / mpi_test_perf · GitLab.
Tested with a 2^32-element 1D array it is OK, but with 2^33 elements it works with 1 and 2 procs and fails with 4 procs. Using gdb as described here (FAQ: Debugging applications in parallel), I got the following messages:
writeH5compressed:46 dims: 8589934592
[New Thread 0x7ffff5591740 (LWP 1715479)]
Thread 1 "ph5_dataset" received signal SIGSEGV, Segmentation fault.
__memmove_avx_unaligned_erms ()
    at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:262
262     ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.
(gdb) where
#0 __memmove_avx_unaligned_erms ()
at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:
#1 0x00007ffff77b01c5 in opal_convertor_unpack ()
from /lib/x86_64-linux-gnu/libopen-pal.so.40
#2 0x00007ffff5c365df in mca_pml_ob1_recv_request_progress_frag
from /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_pml_o
#3 0x00007ffff5c08a87 in ?? ()
from /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_btl_s
#4 0x00007ffff5c389a8 in mca_pml_ob1_send_request_schedule_once
from /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_pml_o
#5 0x00007ffff5c31429 in mca_pml_ob1_recv_frag_callback_ack ()
from /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_pml_o
#6 0x00007ffff5c08a87 in ?? ()
from /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_btl_s
#7 0x00007ffff5c32ca8 in mca_pml_ob1_recv_request_ack_send_btl
from /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_pml_o
#8 0x00007ffff5c3348c in ?? ()
from /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_pml_o
#9 0x00007ffff5c35b50 in mca_pml_ob1_recv_request_progress_rndv
from /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_pml_o
#10 0x00007ffff5c2f84e in ?? ()
from /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_pml_o
#11 0x00007ffff5c2faa0 in ?? ()
from /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_pml_o
#12 0x00007ffff5c08a87 in ?? ()
from /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_btl_s
#13 0x00007ffff5c3b3ae in mca_pml_ob1_send_request_start_rndv ()
from /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_pml_o
#14 0x00007ffff5c2b9e1 in mca_pml_ob1_isend ()
from /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_pml_o
#15 0x00007ffff5bd119e in ?? ()
from /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_fcoll
#16 0x00007ffff5bd377b in mca_fcoll_vulcan_file_write_all ()
from /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_fcoll
#17 0x00007ffff5bbddc0 in mca_common_ompio_file_write_at_all ()
from /lib/x86_64-linux-gnu/libmca_common_ompio.so.41
#18 0x00007ffff5c1924b in mca_io_ompio_file_write_at_all ()
from /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_io_om
#19 0x00007ffff7ed7d80 in PMPI_File_write_at_all ()
from /lib/x86_64-linux-gnu/libmpi.so.40
#20 0x00005555557e0f0b in H5FD__mpio_write ()
#21 0x0000555555608fe9 in H5FD_write ()
#22 0x0000555555813284 in H5F__accum_write ()
#23 0x00005555556b828b in H5PB_write ()
#24 0x00005555555f950a in H5F_shared_block_write ()
#25 0x00005555557de51c in H5D__mpio_select_write ()
#26 0x00005555557d3676 in H5D__final_collective_io ()
#27 0x00005555557deaf3 in H5D__contig_collective_write ()
#28 0x00005555555c736c in H5D__write ()
#29 0x00005555557a4651 in H5VL__native_dataset_write ()
#30 0x000055555578f617 in H5VL_dataset_write ()
#31 0x00005555555c5fea in H5Dwrite ()
#32 0x0000555555572fff in writeH5compressed (str=0x7fffffffd870
data=0x7ff7e7fff010, dimsf=0x7fffffffd858,
fichier=0x55555583c260 "SDScomp1d.h5", compressed=false)
at /home/henry/projets/mpi_test_perf/ph5_file_utils.c:154
#33 0x000055555556db06 in main ()
Can anybody help?
Thanks in advance,
Gérard
Hi @gerard.henry,
Can you share information such as the versions of HDF5 and OpenMPI, as well as the command used to run the test program? I just built your example with the latest develop branch of HDF5 and OpenMPI 5.0.5, then ran the test program as mpirun -np 4 ./ph5_dataset 33 ph5_dataset_33_4.h5, without a crash.
Across versions there are sometimes issues on the HDF5 side and sometimes on the OpenMPI side, so, if possible, it would be good to try more recent versions of each.
I tested on two platforms:
a sequential machine running Ubuntu 20.04.6 LTS, Open MPI 4.0.3, and HDF5 1.12.3,
and
a cluster running CentOS 7.9, OpenMPI 4.0.5, and HDF5 1.12.3.
The command to run the test program is exactly what you ran, even though I use Slurm on these machines.
On the cluster it's very difficult to recompile OpenMPI, and we have no support to help. But I will try the latest HDF5 like you.
thanks for your reply
Gérard
Hi Jordan,
Could you confirm that you built the code without modification?
On Ubuntu 20, with HDF5 1.14.4.3, the latest release on GitHub, I got the following errors:
In file included from /home/henry/projets/mpi_test_perf/ph5_file_utils.c:11:
/home/henry/projets/mpi_test_perf/ph5_file_utils.h:16:28: error: conflicting types for ‘hsize_t’
16 | typedef unsigned long long hsize_t;
| ^~~~~~~
In file included from /home/henry/LIBRARY_PARA/LIBRARIES/hdf5-1.14.4.3/include/hdf5.h:21,
from /home/henry/projets/mpi_test_perf/ph5_file_utils.c:6:
/home/henry/LIBRARY_PARA/LIBRARIES/hdf5-1.14.4.3/include/H5public.h:297:18: note: previous declaration of ‘hsize_t’ was here
297 | typedef uint64_t hsize_t;
| ^~~~~~~
/home/henry/projets/mpi_test_perf/ph5_file_utils.c: In function ‘writeH5compressed’:
/home/henry/projets/mpi_test_perf/ph5_file_utils.c:86:37: warning: passing argument 2 of ‘H5Screate_simple’ from incompatible pointer type [-Wincompatible-pointer-types]
86 | filespace = H5Screate_simple(1, dimsf, NULL);
| ^~~~~
| |
| const hsize_t * {aka const long long unsigned int *}
In file included from /home/henry/LIBRARY_PARA/LIBRARIES/hdf5-1.14.4.3/include/H5Ppublic.h:29,
from /home/henry/LIBRARY_PARA/LIBRARIES/hdf5-1.14.4.3/include/hdf5.h:35,
from /home/henry/projets/mpi_test_perf/ph5_file_utils.c:6:
/home/henry/LIBRARY_PARA/LIBRARIES/hdf5-1.14.4.3/include/H5Spublic.h:323:55: note: expected ‘const hsize_t *’ {aka ‘const long unsigned int *’} but argument is of type ‘const hsize_t *’ {aka ‘const long long unsigned int *’}
323 | H5_DLL hid_t H5Screate_simple(int rank, const hsize_t dims[], const hsize_t maxdims[]);
| ~~~~~~~~~~~~~~^~~~~~
/home/henry/projets/mpi_test_perf/ph5_file_utils.c:104:44: warning: passing argument 3 of ‘H5Pset_chunk’ from incompatible pointer type [-Wincompatible-pointer-types]
104 | status = H5Pset_chunk(plist_id, 1, &chunk_dim);
| ^~~~~~~~~~
| |
| hsize_t * {aka long long unsigned int *}
In file included from /home/henry/LIBRARY_PARA/LIBRARIES/hdf5-1.14.4.3/include/hdf5.h:35,
from /home/henry/projets/mpi_test_perf/ph5_file_utils.c:6:
/home/henry/LIBRARY_PARA/LIBRARIES/hdf5-1.14.4.3/include/H5Ppublic.h:6425:69: note: expected ‘const hsize_t *’ {aka ‘const long unsigned int *’} but argument is of type ‘hsize_t *’ {aka ‘long long unsigned int *’}
6425 | H5_DLL herr_t H5Pset_chunk(hid_t plist_id, int ndims, const hsize_t dim[/*ndims*/]);
| ~~~~~~~~~~~~~~^~~~~~~~~~~~~~
/home/henry/projets/mpi_test_perf/ph5_file_utils.c:145:40: warning: passing argument 2 of ‘H5Screate_simple’ from incompatible pointer type [-Wincompatible-pointer-types]
145 | hid_t mspace = H5Screate_simple(1, &count, NULL);
| ^~~~~~
| |
| hsize_t * {aka long long unsigned int *}
In file included from /home/henry/LIBRARY_PARA/LIBRARIES/hdf5-1.14.4.3/include/H5Ppublic.h:29,
from /home/henry/LIBRARY_PARA/LIBRARIES/hdf5-1.14.4.3/include/hdf5.h:35,
from /home/henry/projets/mpi_test_perf/ph5_file_utils.c:6:
/home/henry/LIBRARY_PARA/LIBRARIES/hdf5-1.14.4.3/include/H5Spublic.h:323:55: note: expected ‘const hsize_t *’ {aka ‘const long unsigned int *’} but argument is of type ‘hsize_t *’ {aka ‘long long unsigned int *’}
323 | H5_DLL hid_t H5Screate_simple(int rank, const hsize_t dims[], const hsize_t maxdims[]);
| ~~~~~~~~~~~~~~^~~~~~
/home/henry/projets/mpi_test_perf/ph5_file_utils.c:146:49: warning: passing argument 3 of ‘H5Sselect_hyperslab’ from incompatible pointer type [-Wincompatible-pointer-types]
146 | H5Sselect_hyperslab(wspace, H5S_SELECT_SET, &offset, NULL, &count, NULL);
| ^~~~~~~
| |
| hsize_t * {aka long long unsigned int *}
In file included from /home/henry/LIBRARY_PARA/LIBRARIES/hdf5-1.14.4.3/include/H5Ppublic.h:29,
from /home/henry/LIBRARY_PARA/LIBRARIES/hdf5-1.14.4.3/include/hdf5.h:35,
from /home/henry/projets/mpi_test_perf/ph5_file_utils.c:6:
/home/henry/LIBRARY_PARA/LIBRARIES/hdf5-1.14.4.3/include/H5Spublic.h:1213:83: note: expected ‘const hsize_t *’ {aka ‘const long unsigned int *’} but argument is of type ‘hsize_t *’ {aka ‘long long unsigned int *’}
1213 | H5_DLL herr_t H5Sselect_hyperslab(hid_t space_id, H5S_seloper_t op, const hsize_t start[],
| ~~~~~~~~~~~~~~^~~~~~~
/home/henry/projets/mpi_test_perf/ph5_file_utils.c:146:64: warning: passing argument 5 of ‘H5Sselect_hyperslab’ from incompatible pointer type [-Wincompatible-pointer-types]
146 | H5Sselect_hyperslab(wspace, H5S_SELECT_SET, &offset, NULL, &count, NULL);
| ^~~~~~
| |
| hsize_t * {aka long long unsigned int *}
In file included from /home/henry/LIBRARY_PARA/LIBRARIES/hdf5-1.14.4.3/include/H5Ppublic.h:29,
from /home/henry/LIBRARY_PARA/LIBRARIES/hdf5-1.14.4.3/include/hdf5.h:35,
from /home/henry/projets/mpi_test_perf/ph5_file_utils.c:6:
/home/henry/LIBRARY_PARA/LIBRARIES/hdf5-1.14.4.3/include/H5Spublic.h:1214:73: note: expected ‘const hsize_t *’ {aka ‘const long unsigned int *’} but argument is of type ‘hsize_t *’ {aka ‘long long unsigned int *’}
1214 | const hsize_t stride[], const hsize_t count[], const hsize_t block[]);
| ~~~~~~~~~~~~~~^~~~~~~
make[3]: *** [CMakeFiles/ph5_dataset.dir/build.make:90: CMakeFiles/ph5_dataset.dir/ph5_file_utils.c.o] Error 1
thanks
Indeed, I did have to make just a few modifications to get it to compile, which could potentially affect results:
changes.patch (1.2 KB). Since the typedefs are already in hdf5.h, I just removed them.
OK, thanks. I made the same changes and it now works on the sequential machine, but not on the cluster, where I get these errors:
[skylake077:109320] *** Process received signal ***
[skylake077:109320] Signal: Segmentation fault (11)
[skylake077:109320] Signal code: (-6)
[skylake077:109320] Failing at address: 0x6d20001ab08
[skylake077:109320] [ 0] /lib64/libpthread.so.0(+0xf630)[0x2aaaabd3d630]
[skylake077:109320] [ 1] /lib64/libc.so.6(+0x154e1b)[0x2aaaac09ee1b]
[skylake077:109320] [ 2] /trinity/shared/apps/tr17.10/x86_64/openmpi-gcc112-psm2-4.0.5/lib/libopen-pal.so.40(opal_generic_simple_unpack+0x6e7)[0x2aaaac621097]
[skylake077:109320] [ 3] /trinity/shared/apps/tr17.10/x86_64/openmpi-gcc112-psm2-4.0.5/lib/libmpi.so.40(ompi_datatype_sndrcv+0x1df)[0x2aaaab149d0f]
[skylake077:109320] [ 4] /trinity/shared/apps/tr17.10/x86_64/openmpi-gcc112-psm2-4.0.5/lib/openmpi/mca_coll_basic.so(mca_coll_basic_scatterv_intra+0x14b)[0x2aaabf80a99b]
[skylake077:109320] [ 5] /trinity/shared/apps/tr17.10/x86_64/openmpi-gcc112-psm2-4.0.5/lib/libmpi.so.40(PMPI_Scatterv+0x170)[0x2aaaab1709e0]
[skylake077:109320] [ 6] /home/ceyraud/projets/mpi_test_perf/build/ph5_dataset[0x634903]
[skylake077:109320] [ 7] /home/ceyraud/projets/mpi_test_perf/build/ph5_dataset[0x639ada]
[skylake077:109320] [ 8] /home/ceyraud/projets/mpi_test_perf/build/ph5_dataset[0x63b93b]
[skylake077:109320] [ 9] /home/ceyraud/projets/mpi_test_perf/build/ph5_dataset[0x63c7c9]
[skylake077:109320] [10] /home/ceyraud/projets/mpi_test_perf/build/ph5_dataset[0x44f1f0]
[skylake077:109320] [11] /home/ceyraud/projets/mpi_test_perf/build/ph5_dataset[0x6079b2]
[skylake077:109320] [12] /home/ceyraud/projets/mpi_test_perf/build/ph5_dataset[0x5f57d7]
[skylake077:109320] [13] /home/ceyraud/projets/mpi_test_perf/build/ph5_dataset[0x43b9ed]
[skylake077:109320] [14] /home/ceyraud/projets/mpi_test_perf/build/ph5_dataset[0x43eaf6]
[skylake077:109320] [15] /home/ceyraud/projets/mpi_test_perf/build/ph5_dataset[0x4077ca]
[skylake077:109320] [16] /home/ceyraud/projets/mpi_test_perf/build/ph5_dataset[0x40728b]
[skylake077:109320] [17] /lib64/libc.so.6(__libc_start_main+0xf5)[0x2aaaabf6c555]
[skylake077:109320] [18] /home/ceyraud/projets/mpi_test_perf/build/ph5_dataset[0x406ea3]
[skylake077:109320] *** End of error message ***
On your side, have you tested it on a sequential machine or on a cluster?
Thanks for your help.
So far I’ve only tested this on my local machine. It’s possible there may be an issue with MPI_Scatterv on that cluster’s version of OpenMPI. The algorithm for dealing with compressed data in parallel switches over to MPI_Scatterv as the number of chunks involved increases, so that’s likely why you only see the issue as you increase the I/O size.