Problems reading HDF5 files with IntelMPI?

Hi,

Is anyone aware of trouble with PHDF5 and IntelMPI? A test code that
reads an HDF5 file in parallel has trouble scaling when I run it with
IntelMPI, but no trouble if I run it, for example, with POE.

I'm using Intel compilers 13.0.1, IntelMPI 4.1.3.049, and HDF5 1.8.10

The code just reads an 800x800x800 HDF5 file, and the times I get for
reading it (in seconds) are:

128 procs - 0.7262E+01
1024 procs - 0.9815E+01
1280 procs - 0.9930E+01
1600 procs - ??? (it gets stalled…)

But the same code (compiled with the above modules), submitted with
IBM's POE instead of IntelMPI, has no trouble with 1600 procs (actually
no trouble at all with up to 4096 procs) and reads the file in
0.8963E+01 secs.

Any help appreciated,

···

--
Ángel de Vicente
http://www.iac.es/galeria/angelv/

Hi,

Is anyone aware of trouble with PHDF5 and IntelMPI? A test code that
reads an HDF5 file in parallel has trouble scaling when I run it with
IntelMPI, but no trouble if I run it, for example, with POE.

The Curie web site says "Global File System" and "Lustre", so I don't know which one you're using.

If it's lustre, maybe this will help you:

https://press3.mcs.anl.gov/romio/2014/06/12/romio-and-intel-mpi/

==rob

···

On 10/27/2014 08:21 AM, Angel de Vicente wrote:

I'm using Intel compilers 13.0.1, IntelMPI 4.1.3.049, and HDF5 1.8.10

The code just reads an 800x800x800 HDF5 file, and the times I get for
reading it (in seconds) are:

128 procs - 0.7262E+01
1024 procs - 0.9815E+01
1280 procs - 0.9930E+01
1600 procs - ??? (it gets stalled…)

But the same code (compiled with the above modules), submitted with
IBM's POE instead of IntelMPI, has no trouble with 1600 procs (actually
no trouble at all with up to 4096 procs) and reads the file in
0.8963E+01 secs.

Any help appreciated,


--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

Hi Rob,

Rob Latham <robl@mcs.anl.gov> writes:

···

On 10/27/2014 08:21 AM, Angel de Vicente wrote:

Hi,

Is anyone aware of trouble with PHDF5 and IntelMPI? A test code that
reads an HDF5 file in parallel has trouble scaling when I run it with
IntelMPI, but no trouble if I run it, for example, with POE.

The Curie web site says "Global File System" and "Lustre", so I don't know which
one you're using.

If it's lustre, maybe this will help you:

https://press3.mcs.anl.gov/romio/2014/06/12/romio-and-intel-mpi/

thanks, but this issue is not happening in CURIE, but in MareNostrum,
which uses GPFS.
--
Ángel de Vicente
http://www.iac.es/galeria/angelv/

Good to know. While Intel MPI does not include any GPFS optimizations, there's really only one optimization that matters for GPFS writes: aligning ROMIO file domains to file system block boundaries.

Set the MPI-IO hint "striping_unit" to the GPFS block size.

  Setting MPI-IO hints through HDF5 requires property lists and some other gyrations. Here's a good example, except you would set different hints:

https://wickie.hlrs.de/platforms/index.php/MPI-IO#Adapting_HDF5.27s_MPI_I.2FO_parameters_to_prevent_locking_on_Lustre

Determining the GPFS block size, if you don't know it already, is as simple as 'stat -f'.
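
Assuming the test code is Fortran, the gyrations boil down to something
like the sketch below (untested; the file name and the 8 MiB block size
are placeholders, so substitute whatever 'stat -f' reports for your GPFS
filesystem):

  program open_with_hints
    use mpi
    use hdf5
    implicit none

    integer        :: mpierr, hdferr, info
    integer(hid_t) :: fapl_id, file_id

    call MPI_Init(mpierr)
    call h5open_f(hdferr)

    ! MPI_Info object carrying the ROMIO hint; 8388608 (8 MiB) is an
    ! assumed GPFS block size, not a recommendation
    call MPI_Info_create(info, mpierr)
    call MPI_Info_set(info, "striping_unit", "8388608", mpierr)

    ! Attach the hints to a file-access property list that selects the
    ! MPI-IO driver, then open the file collectively with that list
    call h5pcreate_f(H5P_FILE_ACCESS_F, fapl_id, hdferr)
    call h5pset_fapl_mpio_f(fapl_id, MPI_COMM_WORLD, info, hdferr)
    call h5fopen_f("test_800.h5", H5F_ACC_RDONLY_F, file_id, hdferr, &
                   access_prp=fapl_id)

    ! ... hyperslab selection and parallel reads would go here ...

    call h5fclose_f(file_id, hdferr)
    call h5pclose_f(fapl_id, hdferr)
    call MPI_Info_free(info, mpierr)
    call h5close_f(hdferr)
    call MPI_Finalize(mpierr)
  end program open_with_hints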

==rob

···

On 10/28/2014 04:28 AM, Angel de Vicente wrote:

Hi Rob,

Rob Latham <robl@mcs.anl.gov> writes:

On 10/27/2014 08:21 AM, Angel de Vicente wrote:

Hi,

Is anyone aware of trouble with PHDF5 and IntelMPI? A test code that
reads an HDF5 file in parallel has trouble scaling when I run it with
IntelMPI, but no trouble if I run it, for example, with POE.

The Curie web site says "Global File System" and "Lustre", so I don't know which
one you're using.

If it's lustre, maybe this will help you:

https://press3.mcs.anl.gov/romio/2014/06/12/romio-and-intel-mpi/

thanks, but this issue is not happening in CURIE, but in MareNostrum,
which uses GPFS.

--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

Hi,

Rob Latham <robl@mcs.anl.gov> writes:

Is anyone aware of trouble with PHDF5 and IntelMPI? A test code that
reads an HDF5 file in parallel has trouble scaling when I run it with
IntelMPI, but no trouble if I run it, for example, with POE.

thanks, but this issue is not happening in CURIE, but in MareNostrum,
which uses GPFS.

Good to know. While Intel MPI does not include any GPFS optimizations, there's
really only one optimization that matters for GPFS writes: aligning ROMIO file
domains to file system block boundaries.

Set the MPI-IO hint "striping_unit" to the GPFS block size.

But this problem is happening when reading a file, not writing it (in
any case, I have tried setting the striping_unit as well, but it made no
difference). So far I have no idea what is going on. ~1500 procs is
where the trouble begins, but the number of processors that breaks the
program is not fixed: I ran it successfully with 1515 processors, then
it failed with 1480...

Any pointers appreciated,

···

--
Ángel de Vicente
http://www.iac.es/galeria/angelv/

Set the MPI-IO hint "striping_unit" to the GPFS block size.

But this problem is happening when reading a file, not writing it

ah, it's right there in the subject. sorry about that.

(in any case, I have tried setting the striping_unit as well, but it
made no difference). So far I have no idea what is going on. ~1500 procs
is where the trouble begins, but the number of processors that breaks
the program is not fixed: I ran it successfully with 1515 processors,
then it failed with 1480...

I suppose all one can do is get a backtrace from a few processors (by, for example, attaching to a hung process with gdb) and see if you are stuck in communication or if you are stuck in a case where the processes are making very many teeny-tiny read operations (so not stuck, but performing I/O so poorly as to be making imperceptible progress)

==rob

···

On 10/30/2014 08:57 AM, Angel de Vicente wrote:

Any pointers appreciated,


--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

Hi,

Rob Latham <robl@mcs.anl.gov> writes:

(in any case, I have tried setting the striping_unit as well, but it
made no difference). So far I have no idea what is going on. ~1500 procs
is where the trouble begins, but the number of processors that breaks
the program is not fixed: I ran it successfully with 1515 processors,
then it failed with 1480...

I suppose all one can do is get a backtrace from a few processors (by, for
example, attaching to a hung process with gdb) and see if you are stuck in
communication or if you are stuck in a case where the processes are making very
many teeny-tiny read operations (so not stuck, but performing I/O so poorly as
to be making imperceptible progress)

I will try to attach to some process and see if I can get somewhere, but
the issue definitely seems to be a communication one: I changed the
program so that no actual reading is done, just opening the file and
closing it, and it still hangs at the h5fopen_f call, so for some reason
the file cannot even be opened when I go beyond ~1500 procs...
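
In case it is useful, a minimal open/close test along these lines (the
file name is just a placeholder) is enough to reproduce the hang here:

  program open_close_only
    use mpi
    use hdf5
    implicit none

    integer        :: mpierr, hdferr
    integer(hid_t) :: fapl_id, file_id

    call MPI_Init(mpierr)
    call h5open_f(hdferr)

    ! Collective open through the MPI-IO driver, followed by an
    ! immediate close; no dataset is ever touched, yet the run gets
    ! stuck inside h5fopen_f beyond ~1500 procs
    call h5pcreate_f(H5P_FILE_ACCESS_F, fapl_id, hdferr)
    call h5pset_fapl_mpio_f(fapl_id, MPI_COMM_WORLD, MPI_INFO_NULL, hdferr)
    call h5fopen_f("test_800.h5", H5F_ACC_RDONLY_F, file_id, hdferr, &
                   access_prp=fapl_id)

    call h5fclose_f(file_id, hdferr)
    call h5pclose_f(fapl_id, hdferr)
    call h5close_f(hdferr)
    call MPI_Finalize(mpierr)
  end program open_close_only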

Thanks,

···

--
Ángel de Vicente
http://www.iac.es/galeria/angelv/