HDF5 parallel I/O errors

Hi,

Are there plans to upgrade the MPI function calls from MPI-1 to MPI-2, while
still supporting MPI-1 for anyone who might need it?

The reason I am asking is:

We have an MPI wrapper library which contains definitions of the MPI functions
and some type checking etc. before the actual MPI functions are called. Based
on which MPI the user has chosen at run time (Intel, HP, Open MPI, etc.), the
correct wrapper is picked (wrapper/intel/*.so, wrapper/hp/*.so,
wrapper/openmpi/*.so), which in turn is compiled against the Intel/HP/Open MPI
libraries.

So the flow of any MPI function call is: application -> wrapper ->
Intel/HP/Open MPI.
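
For illustration only, a minimal sketch of how such a run-time dispatch might
look (the real wrapper is not shown in this thread; the path layout, the
MPI_VENDOR environment variable, and the wrap_MPI_Init symbol name are all made
up):

/* Hypothetical sketch of the application -> wrapper -> vendor-MPI flow.
 * Nothing below is taken from the actual wrapper library. */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

typedef int (*wrap_init_fn)(int *, char ***);

int main(int argc, char **argv)
{
    /* Pick the vendor the user selected at run time (assumed env var). */
    const char *vendor = getenv("MPI_VENDOR");
    char path[256];
    snprintf(path, sizeof(path), "wrapper/%s/libmpiwrap.so",
             vendor ? vendor : "openmpi");

    /* Load that vendor's wrapper, which is compiled against the matching
     * MPI library. */
    void *handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL);
    if (!handle) {
        fprintf(stderr, "dlopen(%s) failed: %s\n", path, dlerror());
        return 1;
    }

    /* Every MPI call in the application resolves to a forwarding stub
     * looked up in the selected library. */
    wrap_init_fn wrap_init = (wrap_init_fn) dlsym(handle, "wrap_MPI_Init");
    if (wrap_init)
        wrap_init(&argc, &argv);

    dlclose(handle);
    return 0;
}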

Now I have compiled PHDF5, after changes, to use the wrapper stub (a
placeholder needed for compiling the application code) instead of any specific
MPI.

Now, while writing parallel collective I/O with 2 nodes, it fails, and the
stack is:
MPI_Type_struct in H5S_mpio_hyper_type (H5Smpio.c) <- H5S_mpio_space_type
(H5Smpio.c) <- H5D_inter_collective_io (H5Dmpio.c) <-
H5D_contig_collective_write (H5Dmpio.c)

The failure seems to be due to MPI_LB & MPI_UB (also defined in the wrapper,
but at run time the datatype constant resolves into the user-selected MPI
library). The MPI-2 guidelines say that these are deprecated and that their
use is awkward and error prone.

And I am having a really hard time figuring out how to replace MPI_LB with
something appropriate.

Thanks
Saurabh


You can use MPI_Type_create_resized, but unfortunately a lot of ROMIO-based
MPI-IO implementations won't understand that type and will give an error.
That's slowly changing, but it takes a while for ROMIO changes to propagate
everywhere. Implementations based on MPICH2-1.0.8 and newer will understand
that type.
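
For reference, a minimal sketch of the replacement (the names below are
illustrative, not the ones in H5Smpio.c): the MPI-1 LB/UB struct and the MPI-2
resized type describe the same lower bound and extent.

#include <mpi.h>

/* Sketch only: replace the MPI_LB/MPI_UB struct with a resized copy of
 * outer_type whose lower bound is 0 and whose extent is the old MPI_UB
 * displacement. */
static int make_resized(MPI_Datatype outer_type, MPI_Aint ub_displacement,
                        MPI_Datatype *resized_type)
{
    int mpi_code = MPI_Type_create_resized(outer_type,
                                           0,               /* lower bound */
                                           ub_displacement, /* extent      */
                                           resized_type);
    if (mpi_code == MPI_SUCCESS)
        mpi_code = MPI_Type_commit(resized_type);
    return mpi_code;
}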

==rob

···

On Thu, May 13, 2010 at 12:04:58AM -0700, sranjan wrote:

> And I am having a really hard time figuring out how to replace MPI_LB with
> something appropriate.

--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

Thanks for the info. When I change my code to use MPI_Type_create_resized, I
get an error on NFS, maybe because of the ROMIO issue mentioned in your reply.
This is what I am doing:

block_length[0] = 0;
block_length[1] = 0;
block_length[2] = 0;

displacement[0] = 0;
displacement[1] = d[i].start * offset[i] * elmt_size;
displacement[2] = (MPI_Aint)elmt_size * max_xtent[i];

old_types[0] = MPI_LB;
old_types[1] = outer_type;
old_types[2] = MPI_UB;

#ifdef H5_HAVE_MPI2
     mpi_code = MPI_Type_create_resized(outer_type, /* old type */
                                         displacement[0], /* lower bound */
                                         displacement[2], /* extent */
                                         &inner_type); /* new type */
#else
     mpi_code = MPI_Type_struct ( 3, /* count */
                                   block_length, /* blocklengths */
                                   displacement, /* displacements */
                                   old_types, /* old types */
                                   &inner_type); /* new type */
#endif

So when I turn on H5_HAVE_MPI2, I get "Error: Unsupported datatype passed to
ADIOI_Count_contiguous_blocks" at run time. When I compile without this flag,
it works fine.

Is the usage of MPI_Type_create_resized fine in the above case?
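
A quick sanity check worth adding (sketch only, not something already in the
code above; fprintf needs <stdio.h>): ask MPI what bounds the new type ends up
with. For an equivalent replacement, the reported lower bound and extent should
match displacement[0] and displacement[2].

{
    MPI_Aint lb, extent;

    /* Query the bounds of the type built by the H5_HAVE_MPI2 branch. */
    MPI_Type_get_extent(inner_type, &lb, &extent);
    if (lb != displacement[0] || extent != displacement[2])
        fprintf(stderr, "unexpected bounds: lb=%ld extent=%ld\n",
                (long) lb, (long) extent);
}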

Thanks
Saurabh

···

On Thu, May 13, 2010 at 8:22 PM, Rob Latham <robl@mcs.anl.gov> wrote:

> You can use MPI_Type_create_resized, but unfortunately a lot of ROMIO-based
> MPI-IO implementations won't understand that type and will give an error.
> Implementations based on MPICH2-1.0.8 and newer will understand that type.

Even on Panasas I am getting the same error with HP-MPI 2.3.1.

···

On Fri, Jul 30, 2010 at 11:05 AM, Saurabh Ranjan <saurabh.ranjan@ansys.com> wrote:

> So when I turn on H5_HAVE_MPI2, I get "Error: Unsupported datatype passed
> to ADIOI_Count_contiguous_blocks" at run time. When I compile without this
> flag, it works fine.
>
> Is the usage of MPI_Type_create_resized fine in the above case?

I have no idea what version of ROMIO HP-MPI is based on, but I bet it's
quite old. It looks like you're doing things right, but resized types
aren't supported.
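
For illustration, a standalone sketch (not HDF5 or ROMIO code) of what trips
the old flattening code: ROMIO inspects the filetype's constructor with
MPI_Type_get_envelope, and a version that predates MPI_COMBINER_RESIZED falls
into its default case and prints the "Unsupported datatype" error you are
seeing.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Datatype t;
    int ni, na, nd, combiner;

    MPI_Init(&argc, &argv);

    /* Build a trivially resized type, the same constructor HDF5's
     * H5_HAVE_MPI2 branch uses. */
    MPI_Type_create_resized(MPI_INT, 0, 2 * (MPI_Aint) sizeof(int), &t);
    MPI_Type_commit(&t);

    /* The same envelope query ROMIO performs when flattening a filetype. */
    MPI_Type_get_envelope(t, &ni, &na, &nd, &combiner);
    printf("combiner is MPI_COMBINER_RESIZED: %s\n",
           combiner == MPI_COMBINER_RESIZED ? "yes" : "no");

    MPI_Type_free(&t);
    MPI_Finalize();
    return 0;
}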

The fix is quite simple. Perhaps you can open a support request with
HP and let them know about this defect?

==rob

···

On Fri, Jul 30, 2010 at 11:12:37AM +0530, Saurabh Ranjan wrote:

> Even on Panasas I am getting the same error with HP-MPI 2.3.1.

--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

Block length is 1; sorry for the typo in my earlier mail. Here is the code
which fails on NFS and PanFS.

block_length[0] = 1;
block_length[1] = 1;
block_length[2] = 1;

displacement[0] = 0;
displacement[1] = d[i].start * offset[i] * elmt_size;
displacement[2] = (MPI_Aint)elmt_size * max_xtent[i];

old_types[0] = MPI_LB;
old_types[1] = outer_type;
old_types[2] = MPI_UB;

#ifdef H5_HAVE_MPI2
     mpi_code = MPI_Type_create_resized(outer_type, /* old type */
                                         displacement[0], /* lower bound */
                                         displacement[2], /* extent */
                                         &inner_type); /* new type */
#else
     mpi_code = MPI_Type_struct ( 3, /* count */
                                   block_length, /* blocklengths */
                                   displacement, /* displacements */
                                   old_types, /* old types */
                                   &inner_type); /* new type */
#endif

···

On Fri, Jul 30, 2010 at 11:12 AM, Saurabh Ranjan <saurabh.ranjan@ansys.com> wrote:

> Even on Panasas I am getting the same error with HP-MPI 2.3.1.
