Error during make check on HDF5

Hi all,

I am getting an error while running make check on HDF5. The error is:

===================================
PHDF5 tests detected 1536 errors


Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.


mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

Process name: [[37185,1],0]
Exit code: 1

Command exited with non-zero status 1
0.77user 0.44system 0:01.50elapsed 81%CPU (0avgtext+0avgdata 185496maxresident)k
0inputs+70640outputs (2628major+56783minor)pagefaults 0swaps
make[4]: *** [Makefile:1577: testphdf5.chkexe_] Error 1
make[4]: Leaving directory ‘/BIG_6_9/opt_shared/hdf5-1.14.0/testpar’
make[3]: *** [Makefile:1710: build-check-p] Error 1
make[3]: Leaving directory ‘/BIG_6_9/opt_shared/hdf5-1.14.0/testpar’
make[2]: *** [Makefile:1558: test] Error 2
make[2]: Leaving directory ‘/BIG_6_9/opt_shared/hdf5-1.14.0/testpar’
make[1]: *** [Makefile:1358: check-am] Error 2
make[1]: Leaving directory ‘/BIG_6_9/opt_shared/hdf5-1.14.0/testpar’
make: *** [Makefile:729: check-recursive] Error 1

Does anyone have any idea what the issue is?

I really appreciate any thoughts,

Best,
Shima

What’s the OS/compiler/MPI version?
Can you capture the whole output (stdout and stderr) and attach? The large number of errors suggests that the tests are not running because of a resource error or missing dependencies.
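
One way to capture both streams in a single file is sketched below. The helper name run_and_log and the file name make_check.log are made up for illustration; only the standard `2>&1` redirection and `tee` are assumed.

```shell
# Run a command, merge stderr into stdout, and keep a copy in a log file.
# run_and_log and make_check.log are hypothetical names used for illustration.
run_and_log() {
    log_file=$1
    shift
    "$@" 2>&1 | tee "$log_file"
}

# Example (run from the top of the HDF5 build tree):
# run_and_log make_check.log make check
```

Note that in a plain POSIX shell the exit status of this pipeline is that of tee, not of make, so it is safer to read the log than to rely on echo $? afterwards.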

G.

mpiexec (OpenRTE) 4.1.4
It seems I cannot attach the file. This is the link to a drive where I put the whole error output:
https://drive.google.com/drive/folders/1jCIBEbMQ0_qkxxxY3WrOuC292MH-PeKk?usp=sharing

Thanks

It looks like things go sour in the atomicity test (around line 12041):

Testing  -- dataset atomic updates (atomicity) 
Testing  -- dataset atomic updates (atomicity) 
Testing  -- dataset atomic updates (atomicity) 
Testing  -- dataset atomic updates (atomicity) 
Testing  -- dataset atomic updates (atomicity) 
Testing  -- dataset atomic updates (atomicity) 
Atomicity Test Failed Process 2: read_buf[1536] is 0, should be 5
Atomicity Test Failed Process 2: read_buf[1537] is 0, should be 5
Atomicity Test Failed Process 2: read_buf[1538] is 0, should be 5
Atomicity Test Failed Process 2: read_buf[1539] is 0, should be 5
Atomicity Test Failed Process 2: read_buf[1540] is 0, should be 5
Atomicity Test Failed Process 2: read_buf[1541] is 0, should be 5
...

What kind of file system are you running this on?
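
To answer this for just the directory where the tests run, something like the sketch below may help. It assumes GNU df (the -T option is not POSIX), and fs_type_of is a made-up helper name.

```shell
# Print the type of the file system that holds a given directory.
# fs_type_of is a hypothetical helper; -T is a GNU df option and
# -P keeps each entry on a single line so the awk field lookup is stable.
fs_type_of() {
    df -PT "$1" | awk 'NR == 2 { print $2 }'
}

# Example: fs_type_of /BIG_6_9/opt_shared/hdf5-1.14.0/testpar
```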

G.

It’s also weird that the script says

All tests were successful.

...

===================================
***PHDF5 tests detected 512 errors***
===================================

That’s maybe our weird sense of humor. :thinking:

G.

This is the file system that I am running:
[skasaei@lange testpar]$ df -T
Filesystem     Type      1K-blocks     Used  Available Use% Mounted on
devtmpfs       devtmpfs       4096        0       4096   0% /dev
tmpfs          tmpfs      98313056        0   98313056   0% /dev/shm
tmpfs          tmpfs      39325224    33952   39291272   1% /run
/dev/sdc4      xfs       451400192 15097848  436302344   4% /
/dev/sdc2      xfs          957440   313736     643704  33% /boot
/dev/sdc1      vfat          97062     7114      89948   8% /boot/efi
/dev/sdb1      ext4     6974131456       32 6622580016   1% /BIG_2_5
/dev/sda1      ext4     6974131456  2579700 6620000348   1% /BIG_6_9
tmpfs          tmpfs      19662608       56   19662552   1% /run/user/42
tmpfs          tmpfs      19662608       40   19662568   1% /run/user/0
tmpfs          tmpfs      19662608       40   19662568   1% /run/user/1004

Any idea how to solve the issue?

I have run make check successfully by disabling parallel support during the build:
./configure --disable-parallel --prefix <…>
then make
and make check, which seems successful.

I would like to install NetCDF and COAWST-Roms and use them in parallel later.
Do you think I will face any issues using them in parallel later on because of this “disabling”?
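
For reference, an installed HDF5 records how it was configured in its libhdf5.settings file, which includes a "Parallel HDF5" line. The sketch below just prints that line; parallel_setting is a made-up helper name, and the example path is the install location that appears in the make install log in this thread.

```shell
# Print the "Parallel HDF5" line from an installed libhdf5.settings file.
# parallel_setting is a hypothetical helper name used for illustration.
parallel_setting() {
    grep -i 'Parallel HDF5' "$1"
}

# Example:
# parallel_setting /opt/opt_shared/hdf5-1.14.0/lib/libhdf5.settings
```

Libraries built on top of HDF5 can be checked against this setting before they are configured for parallel I/O.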

What compiler was used to build mpiexec (OpenRTE) 4.1.4? This may not be the problem, but older GCC versions don’t support atomic operations.
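
If Open MPI's ompi_info tool is on the PATH, its output includes the compiler the library was built with. A minimal sketch, where show_mpi_compiler is a made-up helper name:

```shell
# Show the compiler details recorded by Open MPI, if ompi_info is available.
# show_mpi_compiler is a hypothetical helper name used for illustration.
show_mpi_compiler() {
    if command -v ompi_info >/dev/null 2>&1; then
        ompi_info | grep -i 'compiler'
    else
        echo "ompi_info not found on PATH" >&2
        return 1
    fi
}

# Example: show_mpi_compiler
```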

ldd mpiexec
linux-vdso.so.1 (0x00007fffc8b3d000)
libopen-rte.so.40 => /opt/opt_shared/openmpi-4.1.4/lib/libopen-rte.so.40 (0x00007ff42b50a000)
libopen-pal.so.40 => /opt/opt_shared/openmpi-4.1.4/lib/libopen-pal.so.40 (0x00007ff42b401000)
libm.so.6 => /lib64/libm.so.6 (0x00007ff42b31a000)
libz.so.1 => /lib64/libz.so.1 (0x00007ff42b300000)
libc.so.6 => /lib64/libc.so.6 (0x00007ff42b0f7000)
/lib64/ld-linux-x86-64.so.2 (0x00007ff42b5c6000)

Also the gcc version is: gcc (GCC) 11.3.1 20220421 (Red Hat 11.3.1-2)

I have completed “make” and “make check” successfully after disabling parallel in ./configure (echo $? prints 0 after each command, which indicates success),
but I am facing an issue when I run make install:

[skasaei@lange hdf5-1.14.0]$ make install
Making install in src
make[1]: Entering directory ‘/BIG_6_9/opt_shared/hdf5-1.14.0/src’
make[2]: Entering directory ‘/BIG_6_9/opt_shared/hdf5-1.14.0/src’
/usr/bin/mkdir -p ‘/opt/opt_shared/hdf5-1.14.0/lib’
/bin/sh …/libtool --mode=install /usr/bin/install -c libhdf5.la ‘/opt/opt_shared/hdf5-1.14.0/lib’
libtool: install: /usr/bin/install -c .libs/libhdf5.so.310.0.0 /opt/opt_shared/hdf5-1.14.0/lib/libhdf5.so.310.0.0
libtool: install: (cd /opt/opt_shared/hdf5-1.14.0/lib && { ln -s -f libhdf5.so.310.0.0 libhdf5.so.310 || { rm -f libhdf5.so.310 && ln -s libhdf5.so.310.0.0 libhdf5.so.310; }; })
libtool: install: (cd /opt/opt_shared/hdf5-1.14.0/lib && { ln -s -f libhdf5.so.310.0.0 libhdf5.so || { rm -f libhdf5.so && ln -s libhdf5.so.310.0.0 libhdf5.so; }; })
libtool: install: /usr/bin/install -c .libs/libhdf5.lai /opt/opt_shared/hdf5-1.14.0/lib/libhdf5.la
libtool: install: /usr/bin/install -c .libs/libhdf5.a /opt/opt_shared/hdf5-1.14.0/lib/libhdf5.a
libtool: install: chmod 644 /opt/opt_shared/hdf5-1.14.0/lib/libhdf5.a
libtool: install: ranlib /opt/opt_shared/hdf5-1.14.0/lib/libhdf5.a
libtool: finish: PATH="/opt/opt_shared/openmpi-4.1.4/bin:/home/skasaei/.local/bin:/home/skasaei/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin" ldconfig -n /opt/opt_shared/hdf5-1.14.0/lib

Libraries have been installed in:
/opt/opt_shared/hdf5-1.14.0/lib

If you ever happen to want to link against installed libraries
in a given directory, LIBDIR, you must either use libtool, and
specify the full pathname of the library, or use the ‘-LLIBDIR’
flag during linking and do at least one of the following:

  • add LIBDIR to the ‘LD_LIBRARY_PATH’ environment variable
    during execution
  • add LIBDIR to the ‘LD_RUN_PATH’ environment variable
    during linking
  • use the ‘-Wl,-rpath -Wl,LIBDIR’ linker flag
  • have your system administrator add LIBDIR to ‘/etc/ld.so.conf’

See any operating system documentation about shared libraries for
more information, such as the ld(1) and ld.so(8) manual pages.

/usr/bin/mkdir -p ‘/opt/opt_shared/hdf5-1.14.0/include’
/usr/bin/install -c -m 644 hdf5.h H5api_adpt.h H5overflow.h H5pubconf.h H5public.h H5version.h H5Apublic.h H5ACpublic.h H5Cpublic.h H5Dpublic.h H5Epubgen.h H5Epublic.h H5ESpublic.h H5Fpublic.h H5FDpublic.h H5FDcore.h H5FDdirect.h H5FDfamily.h H5FDhdfs.h H5FDlog.h H5FDmirror.h H5FDmpi.h H5FDmpio.h H5FDmulti.h H5FDonion.h H5FDros3.h H5FDsec2.h H5FDsplitter.h H5FDstdio.h H5FDsubfiling/H5FDsubfiling.h H5FDsubfiling/H5FDioc.h H5FDwindows.h H5Gpublic.h H5Ipublic.h H5Lpublic.h H5Mpublic.h H5MMpublic.h H5Opublic.h H5Ppublic.h H5PLextern.h ‘/opt/opt_shared/hdf5-1.14.0/include’
/usr/bin/install -c -m 644 H5PLpublic.h H5Rpublic.h H5Spublic.h H5Tpublic.h H5VLconnector.h H5VLconnector_passthru.h H5VLnative.h H5VLpassthru.h H5VLpublic.h H5Zpublic.h H5ESdevelop.h H5FDdevelop.h H5Idevelop.h H5Ldevelop.h H5Tdevelop.h H5TSdevelop.h H5Zdevelop.h ‘/opt/opt_shared/hdf5-1.14.0/include’
/usr/bin/mkdir -p ‘/opt/opt_shared/hdf5-1.14.0/lib’
/usr/bin/install -c -m 644 libhdf5.settings ‘/opt/opt_shared/hdf5-1.14.0/lib’
make[2]: Leaving directory ‘/BIG_6_9/opt_shared/hdf5-1.14.0/src’
make[1]: Leaving directory ‘/BIG_6_9/opt_shared/hdf5-1.14.0/src’
Making install in test
make[1]: Entering directory ‘/BIG_6_9/opt_shared/hdf5-1.14.0/test’
make[2]: Entering directory ‘/BIG_6_9/opt_shared/hdf5-1.14.0/test’
make[2]: Nothing to be done for ‘install-exec-am’.
make[2]: Nothing to be done for ‘install-data-am’.
make[2]: Leaving directory ‘/BIG_6_9/opt_shared/hdf5-1.14.0/test’
make[1]: Leaving directory ‘/BIG_6_9/opt_shared/hdf5-1.14.0/test’
Making install in bin
make[1]: Entering directory ‘/BIG_6_9/opt_shared/hdf5-1.14.0/bin’
make[2]: Entering directory ‘/BIG_6_9/opt_shared/hdf5-1.14.0/bin’
/usr/bin/mkdir -p ‘/opt/opt_shared/hdf5-1.14.0/bin’
/usr/bin/install -c h5redeploy ‘/opt/opt_shared/hdf5-1.14.0/bin’
/usr/bin/install: ‘h5redeploy’ and ‘/opt/opt_shared/hdf5-1.14.0/bin/h5redeploy’ are the same file
make[2]: *** [Makefile:843: install-binSCRIPTS] Error 1
make[2]: Leaving directory ‘/BIG_6_9/opt_shared/hdf5-1.14.0/bin’
make[1]: *** [Makefile:1078: install-am] Error 2
make[1]: Leaving directory ‘/BIG_6_9/opt_shared/hdf5-1.14.0/bin’
make: *** [Makefile:729: install-recursive] Error 1

Any idea what the reason is?
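
The install message says the source h5redeploy and the destination h5redeploy are the same file. That would happen if the source tree and the install prefix resolve to the same directory, for example through a symlink. This is an assumption; the log alone does not confirm it. A small sketch to check, where same_dir is a made-up helper name:

```shell
# Return success if two paths resolve to the same physical directory.
# same_dir is a hypothetical helper name used for illustration.
same_dir() {
    a=$(cd "$1" 2>/dev/null && pwd -P) || return 2
    b=$(cd "$2" 2>/dev/null && pwd -P) || return 2
    [ "$a" = "$b" ]
}

# Example with the two bin directories from the log above:
# same_dir /BIG_6_9/opt_shared/hdf5-1.14.0/bin /opt/opt_shared/hdf5-1.14.0/bin \
#     && echo "source and prefix are the same directory"
```

If the two do match, choosing a --prefix outside the source tree would be the usual thing to try.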