Efficient way to write large 4D datasets?

@brtnfld
Yes, the “H5D_ALLOC_MULTI_ROUND_ROBIN” approach is exactly what we are looking for, and it will probably apply to many CFD codes. :slight_smile: I strongly support this effort and would like to see it implemented as soon as possible! :wink: The “extent” feature is of no use to us, though.

In HDF5 chunking, each chunk is stored in a contiguous space in the file. So, if the local array sizes nx, ny, and nz are the same among all MPI processes and the HDF5 chunk size is set to nx x ny x nz, then there will be no interleaved file writes and thus the communication cost in MPI-IO will be significantly reduced.
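
For concreteness, a minimal sketch of this pattern with the parallel HDF5 C API (not code from this thread; the file name, dataset name, and the nx/ny/nz/offset variables are placeholders). The chunk shape equals each rank's local block, so every rank writes exactly one contiguous chunk:

```c
#include <hdf5.h>
#include <mpi.h>

/* Sketch: collective write of a 3D dataset where the chunk shape equals
 * each rank's local block, so each rank writes one contiguous chunk. */
void write_chunked_3d(MPI_Comm comm,
                      hsize_t ngx, hsize_t ngy, hsize_t ngz,  /* global sizes  */
                      hsize_t nx,  hsize_t ny,  hsize_t nz,   /* local block   */
                      hsize_t ox,  hsize_t oy,  hsize_t oz,   /* rank's offset */
                      const double *local_buf)
{
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, comm, MPI_INFO_NULL);
    hid_t file = H5Fcreate("flow.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    hsize_t gdims[3] = { ngx, ngy, ngz };
    hsize_t cdims[3] = { nx, ny, nz };            /* chunk == local block */
    hid_t fspace = H5Screate_simple(3, gdims, NULL);
    hid_t dcpl   = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 3, cdims);
    hid_t dset = H5Dcreate2(file, "u", H5T_NATIVE_DOUBLE, fspace,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT);

    /* Each rank selects its own block in the file; since the block matches
     * one chunk, its data lands in one contiguous region of the file. */
    hsize_t start[3] = { ox, oy, oz };
    hsize_t count[3] = { nx, ny, nz };
    H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, count, NULL);
    hid_t mspace = H5Screate_simple(3, count, NULL);

    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, mspace, fspace, dxpl, local_buf);

    H5Pclose(dxpl); H5Sclose(mspace); H5Dclose(dset);
    H5Pclose(dcpl); H5Sclose(fspace); H5Fclose(file); H5Pclose(fapl);
}
```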

@wkliao
We have a 4D dataset of size nx×ny×nz×na. So if I enable chunking and save our data in na separate datasets, the writes would not be interleaved? I had understood this differently so far…

For 4D arrays, there will be p x na chunks stored in the file.
The number of interleaved regions is p x na, i.e. the data is interleaved at chunk granularity.
If chunking is not enabled, the number will be p x ny x nz x na.

For 4D datasets, you can set the chunk size to nx x ny x nz x na.
That will result in p chunks in the file.

@wkliao
So if I used chunks of size nx×ny×nz×na and wrote all the data into one single dataset, then all interleaving would be avoided, and when reading I would just see one large global array of size ngx×ngy×ngz×na? That might already help me, even without new HDF5 features like H5Dwrite_multi?

That is correct.
h5ls or h5dump will show it is a single 4D dataset.
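
For anyone else following along, here is roughly what that looks like in the HDF5 C API. This is a sketch only, with placeholder names, assuming na is the last dimension as in nx×ny×nz×na: each rank's block, including the full na extent, is one chunk, so p ranks produce p chunks and the file contains one ngx×ngy×ngz×na dataset.

```c
/* Sketch: single 4D dataset, one chunk per rank (chunk = local block x na). */
hsize_t gdims[4] = { ngx, ngy, ngz, na };   /* global extent  */
hsize_t cdims[4] = { nx,  ny,  nz,  na };   /* chunk per rank */
hid_t fspace = H5Screate_simple(4, gdims, NULL);
hid_t dcpl   = H5Pcreate(H5P_DATASET_CREATE);
H5Pset_chunk(dcpl, 4, cdims);
hid_t dset = H5Dcreate2(file, "q", H5T_NATIVE_DOUBLE, fspace,
                        H5P_DEFAULT, dcpl, H5P_DEFAULT);

/* This rank writes its local nx x ny x nz block over the full na range. */
hsize_t start[4] = { ox, oy, oz, 0 };
hsize_t count[4] = { nx, ny, nz, na };
H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, count, NULL);
/* ...then a collective H5Dwrite with a matching 4D memory space, as in
 * the 3D sketch above. */
```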

@wkliao
Thanks for the clarification! I will probably go that way then…

Unfortunately, this was not very clear to me from the documentation alone…

Let us know if it helps. I wasn't able to get any performance benefit from chunking, maybe because it was hard to find a reasonable chunk size commensurate with both the local domain shape and the Lustre stripe size. Approx. 1 MB chunks (aligned with Lustre stripes) increased the 12 GB file size by 19%.
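
In case it is useful to others experimenting with this: one way to line chunks up with the stripe size (not necessarily what was done here) is H5Pset_alignment on the file access property list; the 1 MiB values below are placeholders for the actual Lustre stripe size. The padding this inserts between aligned chunks, plus any partial edge chunks being stored at full size, are plausible sources of the file-size growth mentioned above.

```c
/* Sketch: align every file allocation of at least 1 MiB (including chunks)
 * to a 1 MiB boundary so chunks start on Lustre stripe boundaries.
 * Replace 1048576 with the file system's actual stripe size. */
hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
H5Pset_alignment(fapl, 1048576, 1048576);   /* threshold, alignment (bytes) */
hid_t file = H5Fcreate("flow.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
```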