Hi
I am facing a problem with data chunking on a Lustre 2.10 (and 2.6) filesystem when using HDF5 1.10.1 in parallel mode.
I have attached a simple C program that reproduces the crash immediately. The crash occurs at line 94, when trying
to create the dataset collectively.
I observed the crash when I simply set the chunk size to be the same as the dataset size. I know that this is one
of the "non-recommended" setups according to your documentation ("Pitfalls"):
https://support.hdfgroup.org/HDF5/doc1.8/Advanced/Chunking/index.html
But leaving aside the performance penalty, it should not cause a complete crash of the program.
Furthermore, the same program built against the older HDF5 version 1.8.16 does NOT crash on the same
Lustre 2.10 (or 2.6) version. So it seems that something has changed in the data chunking implementation
between the two major HDF5 versions 1.8.x and 1.10.x.
Could you please tell me how the chunk size should be set in the program when using the newer HDF5 1.10.x?
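One workaround to try, assuming the crash is tied to a single chunk covering the whole dataset (this has not been confirmed), is to split the dataset into roughly one chunk per MPI rank. The sketch below is only an illustration: chunk_len_per_rank is a hypothetical helper, not part of HDF5 or of the attached program, and it assumes a 1-D dataset. unsigned long long stands in for HDF5's hsize_t so that the snippet has no library dependency.

```c
#include <assert.h>

/* Hypothetical helper (not an HDF5 function, not taken from mytest.c).
 * Returns a chunk length that splits a 1-D dataset of dataset_len elements
 * into about one chunk per MPI rank, rounding up so that the chunks cover
 * the whole dataset. unsigned long long is used in place of hsize_t. */
unsigned long long chunk_len_per_rank(unsigned long long dataset_len, int nranks)
{
    unsigned long long n, len;

    /* Chunk dimensions must be positive, so never return 0. */
    if (dataset_len == 0)
        return 1;

    /* Treat a nonsensical rank count as a single rank. */
    if (nranks < 1)
        nranks = 1;

    n = (unsigned long long)nranks;
    len = (dataset_len + n - 1) / n;   /* ceiling division */

    assert(len >= 1);
    return len;
}
```

With the 176000 reported in the log and two MPI processes, this gives a chunk length of 88000. The result would then be passed as the single entry of the dims array given to H5Pset_chunk on the dataset creation property list. Whether this avoids the segmentation fault in ADIOI_Flatten is untested.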
Thanks in advance,
Denis Bertini
PS:
Here is the core dump that I observed as soon as I use more than one MPI process:
H5Pcreate access succeed
H5Pcreate access succeed
-I- Chunk size 176000:
-I- Chunk size 176000:
[lxbk0341:39368] *** Process received signal ***
[lxbk0341:39368] Signal: Segmentation fault (11)
[lxbk0341:39368] Signal code: Address not mapped (1)
[lxbk0341:39368] Failing at address: (nil)
[lxbk0341:39368] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0xf890)[0x7f7742122890]
[lxbk0341:39368] [ 1] /lustre/hebe/rz/dbertini/plasma/softw/lib/openmpi/mca_io_romio314.so(ADIOI_Flatten+0x1577)[0x7f772e8ac657]
[lxbk0341:39368] [ 2] /lustre/hebe/rz/dbertini/plasma/softw/lib/openmpi/mca_io_romio314.so(ADIOI_Flatten_datatype+0xe3)[0x7f772e8ad363]
[lxbk0341:39368] [ 3] /lustre/hebe/rz/dbertini/plasma/softw/lib/openmpi/mca_io_romio314.so(ADIO_Set_view+0x1fd)[0x7f772e8a2f5d]
[lxbk0341:39368] [ 4] /lustre/hebe/rz/dbertini/plasma/softw/lib/openmpi/mca_io_romio314.so(mca_io_romio314_dist_MPI_File_set_view+0x2f6)[0x7f772e889e06]
[lxbk0341:39368] [ 5] /lustre/hebe/rz/dbertini/plasma/softw/lib/openmpi/mca_io_romio314.so(mca_io_romio314_file_set_view+0x22)[0x7f772e883802]
[lxbk0341:39368] [ 6] /lustre/hebe/rz/dbertini/plasma/softw/lib/libmpi.so.40(MPI_File_set_view+0xdd)[0x7f77423bfb2d]
mytest.c (2.58 KB)