Compiling from source - make check fails "Testing shrinking large chunk"

I’m on an AlmaLinux 9.3 system and am building HDF5 1.14.4-2 from source. To install HDF5 on the system, I downloaded the source files and configured with

./configure --with-zlib=${ZLIB_PATH} --prefix=${H5DIR} --enable-hl --enable-parallel --enable-shared

Configuration ran fine. However, when I run make check (or make check RUNPARALLEL='mpirun -oversubscribe', which I saw suggested somewhere), everything mostly passes with the exception of 8 instances of
"Testing shrinking large chunk                                         *FAILED*"
This is an example of what the test output reads for a single instance of the error:

Testing datasets w/Single Chunk indexing                               PASSED
Testing shrinking large chunk                                         *FAILED*
HDF5-DIAG: Error detected in HDF5 (1.14.4-2) thread 0:
  #000: H5D.c line 2014 in H5Dset_extent(): unable to synchronously change a dataset's dimensions
    major: Dataset
    minor: Can't set value
  #001: H5D.c line 1988 in H5D__set_extent_api_common(): unable to set dataset extent
    major: Dataset
    minor: Can't set value
  #002: H5VLcallback.c line 2558 in H5VL_dataset_specific(): unable to execute dataset specific callback
    major: Virtual Object Layer
    minor: Can't operate on object
  #003: H5VLcallback.c line 2526 in H5VL__dataset_specific(): unable to execute dataset specific callback
    major: Virtual Object Layer
    minor: Can't operate on object
  #004: H5VLnative_dataset.c line 530 in H5VL__native_dataset_specific(): unable to set extent of dataset
    major: Dataset
    minor: Can't set value
  #005: H5Dint.c line 3161 in H5D__set_extent(): unable to remove chunks
    major: Dataset
    minor: Write failed
  #006: H5Dchunk.c line 6192 in H5D__chunk_prune_by_extent(): unable to write fill value
    major: Dataset
    minor: Write failed
  #007: H5Dchunk.c line 5811 in H5D__chunk_prune_fill(): unable to lock raw data chunk
    major: Dataset
    minor: Read failed
  #008: H5Dchunk.c line 4528 in H5D__chunk_lock(): unable to read raw data chunk
    major: Low-level I/O
    minor: Read failed
  #009: H5Fio.c line 98 in H5F_shared_block_read(): read through page buffer failed
    major: Low-level I/O
    minor: Read failed
  #010: H5PB.c line 694 in H5PB_read(): read through metadata accumulator failed
    major: Page Buffering
    minor: Read failed
  #011: H5Faccum.c line 248 in H5F__accum_read(): driver read request failed
    major: Low-level I/O
    minor: Read failed
  #012: H5FDint.c line 259 in H5FD_read(): driver read request failed
    major: Virtual File Layer
    minor: Read failed
  #013: H5FDsec2.c line 702 in H5FD__sec2_read(): file read failed: time = Fri Aug 30 09:39:58 2024
, filename = 'chunk_fast.h5', file descriptor = 4, errno = 14, error message = 'Bad address', buf = 0x3284a18, total read size = 2097152, bytes this sub-read = 2097152, bytes actually read = 18446744073709551615, offset = 0
    major: Low-level I/O
    minor: Read failed
*FAILED*
   at dsets.c:12041 in test_large_chunk_shrink()...
HDF5-DIAG: Error detected in HDF5 (1.14.4-2) thread 0:
  #000: H5D.c line 2014 in H5Dset_extent(): unable to synchronously change a dataset's dimensions
    major: Dataset
    minor: Can't set value
  #001: H5D.c line 1988 in H5D__set_extent_api_common(): unable to set dataset extent
    major: Dataset
    minor: Can't set value
  #002: H5VLcallback.c line 2558 in H5VL_dataset_specific(): unable to execute dataset specific callback
    major: Virtual Object Layer
    minor: Can't operate on object
  #003: H5VLcallback.c line 2526 in H5VL__dataset_specific(): unable to execute dataset specific callback
    major: Virtual Object Layer
    minor: Can't operate on object
  #004: H5VLnative_dataset.c line 530 in H5VL__native_dataset_specific(): unable to set extent of dataset
    major: Dataset
    minor: Can't set value
  #005: H5Dint.c line 3161 in H5D__set_extent(): unable to remove chunks
    major: Dataset
    minor: Write failed
  #006: H5Dchunk.c line 6192 in H5D__chunk_prune_by_extent(): unable to write fill value
    major: Dataset
    minor: Write failed
  #007: H5Dchunk.c line 5811 in H5D__chunk_prune_fill(): unable to lock raw data chunk
    major: Dataset
    minor: Read failed
  #008: H5Dchunk.c line 4528 in H5D__chunk_lock(): unable to read raw data chunk
    major: Low-level I/O
    minor: Read failed
  #009: H5Fio.c line 98 in H5F_shared_block_read(): read through page buffer failed
    major: Low-level I/O
    minor: Read failed
  #010: H5PB.c line 694 in H5PB_read(): read through metadata accumulator failed
    major: Page Buffering
    minor: Read failed
  #011: H5Faccum.c line 248 in H5F__accum_read(): driver read request failed
    major: Low-level I/O
    minor: Read failed
  #012: H5FDint.c line 259 in H5FD_read(): driver read request failed
    major: Virtual File Layer
    minor: Read failed
  #013: H5FDsec2.c line 702 in H5FD__sec2_read(): file read failed: time = Fri Aug 30 09:39:58 2024
, filename = 'chunk_fast.h5', file descriptor = 4, errno = 14, error message = 'Bad address', buf = 0x3284a18, total read size = 2097152, bytes this sub-read = 2097152, bytes actually read = 18446744073709551615, offset = 0
    major: Low-level I/O
    minor: Read failed

Would anyone have any idea what is going wrong, or whether it matters? I have tried compiling without the --enable-parallel and --enable-shared flags, with an identical outcome.
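
In case it helps, this is roughly what I understand the failing test to be doing: shrink a chunked dataset whose single chunk is large (the 2097152-byte read in the trace is 2 MiB) with H5Dset_extent(), which makes the library re-read the chunk to re-fill the pruned part. The sketch below is my own minimal reconstruction using the public C API, not the actual code from dsets.c; the file name, dataset name, sizes and fill-time setting are guesses.

/* Minimal sketch (my reconstruction, not the code in dsets.c): create a
 * 1-D dataset with one large 2 MiB chunk, write it, then shrink it with
 * H5Dset_extent(); the shrink forces HDF5 to re-read the partially pruned
 * chunk, which is the read that fails in the trace above. */
#include "hdf5.h"
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    hsize_t dims[1]     = {512 * 1024};        /* 512 Ki ints = 2 MiB chunk */
    hsize_t maxdims[1]  = {H5S_UNLIMITED};
    hsize_t new_dims[1] = {1};                 /* shrink down to one element */
    int    *buf         = calloc(dims[0], sizeof(int));

    hid_t file  = H5Fcreate("shrink_repro.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(1, dims, maxdims);
    hid_t dcpl  = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 1, dims);               /* the whole dataset is one chunk */
    H5Pset_fill_time(dcpl, H5D_FILL_TIME_ALLOC); /* so the pruned part gets refilled */

    hid_t dset = H5Dcreate2(file, "large_chunk", H5T_NATIVE_INT, space,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT);
    H5Dwrite(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf);

    /* This is the call that fails in the test output above */
    if (H5Dset_extent(dset, new_dims) < 0)
        fprintf(stderr, "H5Dset_extent failed\n");

    H5Dclose(dset);
    H5Pclose(dcpl);
    H5Sclose(space);
    H5Fclose(file);
    free(buf);
    return 0;
}

If someone can confirm this is a faithful reduction, I could compile it with h5cc (or h5pcc for the parallel build) and run it directly on the fhgfs mount to see whether the failure follows the file system.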

Thanks,
Jacob

Hi @jacob.svensmark,

It’s likely unrelated, but what type of file system are you running on? Based on the errno value, it looks like the library was trying to read into a memory buffer with an (offset, size) pair that would fall outside the allocated memory region. However, after looking at the relevant code, I’m not sure how that could have happened, since it’s a very short path between where the library allocates the buffer and where it reads into it with the same size it was allocated. This may be a bug in the library’s memory management code, but I don’t believe we’ve seen this issue before.
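
For what it’s worth, errno 14 is EFAULT ("Bad address"), which read()/pread() return when the destination buffer itself is not a valid address in the process, and the -1 return value printed as an unsigned size_t is exactly the 18446744073709551615 in the sec2 driver message. The snippet below is only an illustration of that errno (plain POSIX, not HDF5 code); the file path is arbitrary.

/* Illustration only, not HDF5 code: pread() into an unmapped address fails
 * with EFAULT (errno 14, "Bad address") and returns -1, which prints as
 * 18446744073709551615 when treated as an unsigned size_t. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char    page[4096];
    int     fd = open("/etc/hostname", O_RDONLY);   /* arbitrary readable file */

    ssize_t n = pread(fd, page, sizeof(page), 0);    /* valid buffer: succeeds */
    printf("valid buffer: %zd bytes read\n", n);

    n = pread(fd, (void *)0x1, sizeof(page), 0);     /* bogus buffer: EFAULT */
    printf("bogus buffer: %zd (%zu as size_t), errno=%d (%s)\n",
           n, (size_t)n, errno, strerror(errno));

    close(fd);
    return 0;
}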

The filesystem type reports as fhgfs when I run the command:

$ stat --file-system --format=%T /path/to/folder
fhgfs

EDIT: @jhenderson Okay, that is interesting. I just moved the hdf5-1.14.4-3 folder somewhere else on the system, where the file system is nfs, and the error no longer shows up. fhgfs (BeeGFS) is supposedly a fast parallel file system of sorts. So, will it matter where I compile the HDF5 project and where the binaries are stored, or does the speed have to do with the location of the data file rather than the binary? In other words, will my performance suffer from moving the installation to the nfs file system here?