parallel file writing performance
Hello,
I need to write a large table (~600 GB) to a file, and I'm using HDF5 with MPI. I should point out that this is new to me, so I'm probably missing something essential.
I've got a little test code, adapted from one of the examples found on the HDF Group site. What I don't understand is this: if I run it on my machine for an array of 2^24 complex values, it takes less than a minute, but if I run it on the cluster, after a quarter of an hour I still don't have a result!
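For reference, here is a minimal sketch of the kind of parallel write I mean (not my exact code): each rank writes its own contiguous hyperslab of a 1-D dataset, the file is opened with the MPI-IO driver, and the write is collective. The dataset name, element type (double instead of complex), and sizes are placeholders.

```c
#include <mpi.h>
#include <hdf5.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Global 1-D array; each rank owns a contiguous slab (sizes are placeholders). */
    const hsize_t total = (hsize_t)1 << 24;
    hsize_t count  = total / (hsize_t)nprocs;
    hsize_t offset = (hsize_t)rank * count;

    /* Open the file collectively with the MPI-IO file driver. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate("test.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
    H5Pclose(fapl);

    hid_t filespace = H5Screate_simple(1, &total, NULL);
    hid_t dset = H5Dcreate2(file, "data", H5T_NATIVE_DOUBLE, filespace,
                            H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* Select this rank's hyperslab in the file dataspace. */
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &offset, NULL, &count, NULL);
    hid_t memspace = H5Screate_simple(1, &count, NULL);

    double *buf = malloc(count * sizeof *buf);
    for (hsize_t i = 0; i < count; i++) buf[i] = (double)(offset + i);

    /* Collective transfer — often essential for performance on GPFS. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, buf);

    free(buf);
    H5Pclose(dxpl);
    H5Sclose(memspace);
    H5Sclose(filespace);
    H5Dclose(dset);
    H5Fclose(file);
    MPI_Finalize();
    return 0;
}
```

If the real code uses independent (non-collective) writes, that alone can explain a huge slowdown on a parallel file system.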
On the sequential machine I'm writing to a mechanical disk; on the cluster, the parallel file system is GPFS.
In the Slurm script, I've reserved 2 nodes and all the memory on each node.
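For what it's worth, my submission script looks roughly like this (job name, task count, time limit, and module name are site-specific placeholders):

```shell
#!/bin/bash
#SBATCH --job-name=h5write
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8      # placeholder: adjust to the node's core count
#SBATCH --mem=0                  # --mem=0 requests all memory on each node
#SBATCH --time=00:30:00

module load hdf5-parallel        # module name is site-specific (assumption)
srun ./h5_parallel_write
```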
Does anyone have an idea?
Thanks in advance for your help.