MPIO, Fortran - C confusion

Hello,
I am currently using HDF5 1.8.0 and think that I have found a bug.
I use the MPIO functionality of HDF5 via the Fortran interface; the
following subroutine creates a new HDF5 file:

---- FORTRAN ----
      subroutine openBinary_MPI(nameString, fileId, comm, info)
      use hdf5
      implicit none

      integer fileId, error, comm, info
      integer(hid_t) plistId
      character*(*) nameString

      print *, 'comm:', comm

c     set up file access with parallel I/O
      call h5open_f(error)
      call h5pcreate_f(H5P_FILE_ACCESS_F, plistId, error)
      call h5pset_fapl_mpio_f(plistId, comm, info,
     &     error)

c     create file collectively
      call h5fcreate_f(nameString, H5F_ACC_TRUNC_F, fileId, error,
     &     access_prp = plistId)

c     close property list
      call h5pclose_f(plistId, error)

      end
---- FORTRAN ----

This subroutine is called like this (from another Fortran file):

call openBinary("name-goes-here", id, MPI_COMM_WORLD, MPI_INFO_NULL)

MPI is correctly initialized before calling this subroutine.

When executing the program, I get the following error:

---- OUTPUT ----

xserve33:~/Documents/workspace/ftlm/heisenberg weigert$ ./heisftlm

1 comm: 0
2 comm2: 0
3 null: -1
4 comm: 0

5 null-comm: 0, comm: 0

HDF5-DIAG: Error detected in HDF5 (1.8.0) MPI-process 0:
  #000: H5FDmpio.c line 334 in H5Pset_fapl_mpio(): not a valid communicator
    major: Property lists
    minor: Inappropriate type
total nr. of iterations: 7

^Cforrtl: error (69): process interrupted (SIGINT)

---- OUTPUT ----

As you can see, I added some debugging output. Lines 1 to 4 are prints that
I inserted in my Fortran program. Note that MPI_COMM_NULL == -1 in Fortran.
Line 5 is the interesting one: I added it in "hdf5-1.8.0/src/H5FDmpio.c",
inside the function where the error is thrown. The point is that
MPI_COMM_NULL in C seems to be 0, whereas in Fortran 0 seems to be a valid
value for MPI_COMM_WORLD. That means the check triggered in H5FDmpio.c
seems to be a false positive.
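
To illustrate the suspected mismatch, here is a toy model in plain C. It is only a sketch under assumptions: all toy_* names are hypothetical, it does not use the real MPI or HDF5 headers, and it assumes that C handles are pointers while Fortran handles are integer indices into a translation table. The numeric values are the ones from the debug output above. In real MPI, the standard function MPI_Comm_f2c plays the role of toy_comm_f2c here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: hypothetical names, not the LAM/MPI or HDF5 sources.
 * Assumption: C handles are pointers and Fortran handles are integer
 * indices into a translation table. */
typedef struct toy_comm { const char *name; } *toy_comm_t;

#define TOY_COMM_NULL    ((toy_comm_t)0) /* C null handle, prints as 0 */
#define TOY_F_COMM_WORLD 0               /* Fortran value seen in the output */
#define TOY_F_COMM_NULL  (-1)            /* Fortran value seen in the output */

static struct toy_comm toy_world = { "world" };
static toy_comm_t toy_table[] = { &toy_world }; /* index 0 is the world communicator */

/* Proper translation, in the role MPI_Comm_f2c plays in real MPI. */
toy_comm_t toy_comm_f2c(int f_handle)
{
    int n = (int)(sizeof toy_table / sizeof toy_table[0]);
    if (f_handle < 0 || f_handle >= n)
        return TOY_COMM_NULL;
    return toy_table[f_handle];
}

/* No translation: the Fortran integer is reinterpreted as a C handle. */
toy_comm_t toy_comm_untranslated(int f_handle)
{
    return (toy_comm_t)(intptr_t)f_handle;
}

/* Same shape as the MPI_COMM_NULL test in H5Pset_fapl_mpio. */
int toy_is_valid(toy_comm_t comm)
{
    return comm != TOY_COMM_NULL;
}

/* The three cases from the debug output, stated as assertions. */
void toy_self_check(void)
{
    assert(!toy_is_valid(toy_comm_untranslated(TOY_F_COMM_WORLD))); /* false positive */
    assert(toy_is_valid(toy_comm_f2c(TOY_F_COMM_WORLD)));           /* accepted */
    assert(!toy_is_valid(toy_comm_f2c(TOY_F_COMM_NULL)));           /* rejected */
}
```

In this model the untranslated Fortran value 0 is indistinguishable from the C null handle, which would explain the "not a valid communicator" message if the handle reaches the C check without being translated. Whether that is what actually happens on this system is a guess.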

I attached the config.log file to give you enough information about the
system we are running and the compilation options chosen.

The code I inserted in H5FDmpio.c:

---- C ----
    315 herr_t
    316 H5Pset_fapl_mpio(hid_t fapl_id, MPI_Comm comm, MPI_Info info)
    317 {
    318     H5FD_mpio_fapl_t fa;
    319     H5P_genplist_t *plist; /* Property list pointer */
    320     herr_t ret_value;
    321
    322     FUNC_ENTER_API(H5Pset_fapl_mpio, FAIL)
    323     H5TRACE3("e", "iMcMi", fapl_id, comm, info);
    324
    325     if(fapl_id == H5P_DEFAULT)
    326         HGOTO_ERROR(H5E_PLIST, H5E_BADVALUE, FAIL, "can't set values in default property list")
    327
    328     /* Check arguments */
    329     if(NULL == (plist = (H5P_genplist_t *)H5P_object_verify(fapl_id, H5P_FILE_ACCESS)))
    330         HGOTO_ERROR(H5E_PLIST, H5E_BADTYPE, FAIL, "not a file access list")
    331
    332     printf("\nnull-comm: %lld, comm: %lld\n\n", (long long)MPI_COMM_NULL, (long long)comm);
    333     if(MPI_COMM_NULL == comm)
    334         HGOTO_ERROR(H5E_PLIST, H5E_BADTYPE, FAIL, "not a valid communicator")
    335
    336     /* Initialize driver specific properties */
    337     fa.comm = comm;
    338     fa.info = info;
    339
    340     /* duplication is done during driver setting. */
    341     ret_value= H5P_set_driver(plist, H5FD_MPIO, &fa);
    342
    343 done:
    344     FUNC_LEAVE_API(ret_value)
    345 }
---- C ----

Sorry for this lengthy e-mail and thank you very much for your help in
advance.

With regards,
Stefan Weigert.

config.log (187 KB)

Hi Stefan,

  Hmm, looking in your config.log file, it looks like you are on a Mac XServe and are using LAM MPI (can't tell the version), is that correct? If so, you may need to work on the portability issues, since that's not a configuration we support.

  Also, in the config.log file, it looks like there is some issue with translating communicators between FORTRAN & C (search for MPI_Comm_c2f to see the errors reported). Have you run other MPI FORTRAN programs on this machine with this MPI library/mpicc?

  Quincey

···

On Mar 17, 2008, at 8:53 AM, Stefan Weigert wrote:

----------------------------------------------------------------------
This mailing list is for HDF software users discussion.
To subscribe to this list, send a message to hdf-forum-subscribe@hdfgroup.org.
To unsubscribe, send a message to hdf-forum-unsubscribe@hdfgroup.org.

Hello again,
First of all, sorry that I did not reply earlier, but I have been occupied by
other things.
To answer your questions:

   - No, the hdf5/fortran/testpar tests do not work (I didn't try them
   before, sorry).
   - Yes, other Fortran/MPI programs work well.

Your hint about the config.log was really helpful. I tried that on the
command line and it gave me the same error. This seems to be a weird problem
with the libraries (they are compiled for ppc, x86-32 and x86-64). This
actually happens much too often on the Mac... We need to support these
various architectures because we use different machines in the same
cluster; this is also the main reason why we want to migrate to HDF5 for
our binary formats.
Anyhow, unfortunately I do not have permission to recompile such things, so
I downloaded OpenMPI, which is stated to be the successor of LAM/MPI, and
gave it a local try.

And yes! It now works as expected.
Thank you very much for your help.

Best regards,
Stefan Weigert.

···

On Tue, Mar 18, 2008 at 3:56 PM, Quincey Koziol <koziol@hdfgroup.org> wrote:
