HDF5 parallel test fails


#1

Hi, everyone,

I’m trying to install the parallel version of HDF5 for use with NetCDF. Configuration and installation went well, but the t_bigio test fails. I would appreciate it if anyone could help me with this.

I’m using OpenMPI 1.10.4 and GCC 4.9.3, and I’m installing hdf5-1.10.4. The output from the failing check is below.

===================================
MPI tests finished with no errors

0.87user 1.38system 0:01.21elapsed 186%CPU (0avgtext+0avgdata 123696maxresident)k
0inputs+128outputs (7major+84441minor)pagefaults 0swaps

Finished testing t_mpi

make[4]: Leaving directory `/home/users/ntu/juntao00/hdf5-1.10.4/testpar'
make[4]: Entering directory `/home/users/ntu/juntao00/hdf5-1.10.4/testpar'

Testing t_bigio

t_bigio Test Log

Testing Dataset1 write by ROW

Testing Dataset2 write by COL

Testing Dataset3 write select ALL proc 0, NONE others

Testing Dataset4 write point selection

MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.

[ntu02:26084] 2 more processes have sent help message help-mpi-api.txt / mpi-abort
[ntu02:26084] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
13.80user 3.49system 0:05.98elapsed 289%CPU (0avgtext+0avgdata 804020maxresident)k
0inputs+112outputs (7major+376166minor)pagefaults 0swaps
make[4]: *** [t_bigio.chkexe_] Error 1
make[4]: Leaving directory `/home/users/ntu/juntao00/hdf5-1.10.4/testpar'
make[3]: *** [build-check-p] Error 1
make[3]: Leaving directory `/home/users/ntu/juntao00/hdf5-1.10.4/testpar'
make[2]: *** [test] Error 2
make[2]: Leaving directory `/home/users/ntu/juntao00/hdf5-1.10.4/testpar'
make[1]: *** [check-am] Error 2
make[1]: Leaving directory `/home/users/ntu/juntao00/hdf5-1.10.4/testpar'
make: *** [check-recursive] Error 1


#2

Hi,

I’m having exactly the same problem with OpenMPI 1.10.4 and hdf5-1.10.4, but compiling with gcc 7.3.0.

Can anyone offer any help or advice please?

Regards

Jimmy


#3

Hello Jimmy,

The failure you have encountered is a known issue with OpenMPI 1.10.x. You will
need to upgrade your version of OpenMPI or switch to MPICH.

It is caused by a bug in the OpenMPI MPI datatype code that has been fixed in
current releases of the OpenMPI 2.1.x, 3.0.x, 3.1.x, and 4.0.x series. Specifically,
HDF5-1.10.x has been successfully tested with these OpenMPI versions: 2.1.5, 3.0.3,
3.1.3, and 4.0.
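
To see which case an installation falls into, one option is to classify the version line printed by `mpirun --version`. The following is only a minimal Python sketch, not part of HDF5 or OpenMPI: the function names are made up for illustration, the `mpirun (Open MPI) X.Y.Z` line format is an assumption about what the command prints, and the two series lists simply restate the versions mentioned above rather than a complete compatibility table.

```python
import re

# Release series taken from the reply above; not an exhaustive list.
AFFECTED_SERIES = {(1, 10)}
FIXED_SERIES = {(2, 1), (3, 0), (3, 1), (4, 0)}


def parse_openmpi_version(text):
    """Return (major, minor, patch) from text such as
    'mpirun (Open MPI) 1.10.4', or None if no such line is found.
    The expected line format is an assumption."""
    match = re.search(r"\(Open MPI\)\s+(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def classify(version):
    """Label a version tuple as 'affected', 'fixed', or 'unknown'
    based only on its major.minor series."""
    series = version[:2]
    if series in AFFECTED_SERIES:
        return "affected"
    if series in FIXED_SERIES:
        return "fixed"
    return "unknown"


# Example with the version reported in this thread.
print(classify(parse_openmpi_version("mpirun (Open MPI) 1.10.4")))  # prints "affected"
```

A result of "unknown" only means the series was not mentioned here, so it says nothing either way about whether t_bigio will pass.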

This page lists a couple of other OpenMPI issues in case you encounter them:
https://portal.hdfgroup.org/display/knowledge/OpenMPI+Build+Issues

-Barbara


#4

Hi, Barbara,

Thanks for the information.

Regards

Juntao


#5

Hi Barbara,

Thank you for your prompt update and information.

I’ll install a newer version of OpenMPI and try that.

Regards

Jimmy