hdf5 1.8.15 issue with MPI_Comm_create_keyval being called after MPI_Finalize

We are trying to update our installation of hdf5 from 1.8.14 to 1.8.15. I have a sequence of unit tests that repeatedly run a program with different parameters and check its output. The program is a C++ program that uses the hdf5 library serially (it doesn't use the parallel features of the library), but it also uses MPI. The unit tests are written in Python and may use h5py to check the program's output; they run the program locally with mpiexec and a few ranks. We have openmpi 1.8.1 installed. What I'm finding with hdf5 1.8.15 is that although each unit test seems to succeed, the whole script fails and I get this output:

*** The MPI_Comm_create_keyval() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[(null):16407] Local abort after MPI_FINALIZE completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
HDF5: infinite loop closing library
E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E

When I look through the hdf5 1.8.15 source code, I find that the H5_init_library(void) function in H5.c does

     int mpi_initialized;
     MPI_Initialized(&mpi_initialized);
     if (mpi_initialized) {
         int key_val;
         if(MPI_SUCCESS != (mpi_code = MPI_Comm_create_keyval(MPI_NULL_COPY_FN, ...

I'm not sure that this is correct. I wrote a small program that did

   int mpi_initialized = 0, mpi_finalized = 0;

   MPI_Init(&argc, &argv);
   MPI_Finalize();
   MPI_Initialized(&mpi_initialized);  // still reports 1 here
   MPI_Finalized(&mpi_finalized);      // also reports 1
   std::cout << "After MPI_Finalize: mpi_initialized=" << mpi_initialized
             << " mpi_finalized=" << mpi_finalized << std::endl;

and I found that both mpi_initialized and mpi_finalized were 1 after I called MPI_Finalize. That is, I think H5_init_library() should also check whether MPI_Finalize has been called. I'm not sure which hdf5 call is triggering the error messages; it may happen when h5py unloads (we have h5py version 2.3.1).
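
For what it's worth, a guard that also honors MPI_Finalized might look something like this (just my sketch, not an actual HDF5 patch; the helper name is made up):

    // Sketch only -- reports whether MPI calls are currently legal:
    // MPI must be initialized and must not have been finalized yet.
    static int mpi_is_usable(void)
    {
        int mpi_initialized = 0;
        int mpi_finalized = 0;

        MPI_Initialized(&mpi_initialized);
        MPI_Finalized(&mpi_finalized);

        return mpi_initialized && !mpi_finalized;
    }

H5_init_library() could then test mpi_is_usable() instead of MPI_Initialized() alone before calling MPI_Comm_create_keyval().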

I did try some simple programs that looked like

   MPI_Init(&argc, &argv);
   MPI_Finalize();
   H5Fcreate("test.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
   // or some other simple hdf5 function calls

and I couldn't recreate those error messages, so maybe there is something else wrong.
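
For completeness, the full probe looked roughly like this (the file name is arbitrary):

    #include <mpi.h>
    #include <hdf5.h>

    int main(int argc, char** argv)
    {
        MPI_Init(&argc, &argv);
        MPI_Finalize();

        // First hdf5 call after MPI_Finalize; this is what triggers the
        // lazy H5_init_library() call inside the library.
        hid_t fid = H5Fcreate("probe.h5", H5F_ACC_TRUNC,
                              H5P_DEFAULT, H5P_DEFAULT);
        if (fid >= 0)
            H5Fclose(fid);
        return 0;
    }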

best,

David Schneider
software engineer
LCLS, SLAC

Hi David,

My assumption was that users would/should call MPI_Finalize() as the last thing in their program.
Per the MPI 3.0 standard (p. 361, line 28), an MPI implementation is only required to guarantee that process 0 returns from the call. That said, you can still use process 0 to do serial HDF5 I/O afterwards.
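
In other words, the only post-finalize pattern the standard sanctions looks something like this (a sketch; the file name is arbitrary):

    #include <mpi.h>
    #include <hdf5.h>

    int main(int argc, char** argv)
    {
        int rank = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Finalize();

        // Only process 0 is guaranteed to return from MPI_Finalize().
        if (rank == 0) {
            hid_t fid = H5Fcreate("post.h5", H5F_ACC_TRUNC,
                                  H5P_DEFAULT, H5P_DEFAULT);
            if (fid >= 0)
                H5Fclose(fid);
        }
        return 0;
    }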

From the HDF5 side, we shouldn't make MPI calls if MPI is no longer available and the user is doing serial I/O, so yes, the check should include both is_initialized and is_finalized. I will enter a bug report for this, but again, per the standard, I advise against doing anything after MPI_Finalize() has been called.
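
The safe ordering, in other words, is to finish all HDF5 work and close the library before finalizing MPI, roughly (a sketch; the file name is arbitrary):

    #include <mpi.h>
    #include <hdf5.h>

    int main(int argc, char** argv)
    {
        MPI_Init(&argc, &argv);

        hid_t fid = H5Fcreate("out.h5", H5F_ACC_TRUNC,
                              H5P_DEFAULT, H5P_DEFAULT);
        // ... all hdf5 I/O happens here ...
        H5Fclose(fid);
        H5close();      // shut hdf5 down while MPI is still available

        MPI_Finalize(); // last MPI call; nothing hdf5-related after this
        return 0;
    }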

Thanks,
Mohamad
