HDF5 and parallelism with fork(2)

Hello!

In https://github.com/h5py/h5py/issues/934, a user opens an HDF5 file read-only, forks, and starts accessing the data from the child processes concurrently.
Unfortunately, this doesn’t work reliably, because the lseek+read combinations used in HDF5 are not atomic. Still, fork followed by concurrent reads seems an appealing and easy-to-use paradigm, free of complicated MPI or other kinds of inter-process coordination.

But what if the HDF5 library used the atomic
https://linux.die.net/man/3/pread
(and maybe https://linux.die.net/man/3/pwrite, for consistency)
when this interface is available?
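As a sketch of the underlying problem (plain POSIX calls via Python’s os module with a made-up file path, not HDF5 itself): after fork(2) the parent and child share one file offset through the common open file description, so lseek+read from both sides interferes, while pread reads at an explicit offset and leaves the shared offset untouched.

```python
import os

# Small data file to play with (path is a placeholder).
path = "/tmp/pread_demo.bin"
with open(path, "wb") as f:
    f.write(b"ABCDEFGHIJ")

fd = os.open(path, os.O_RDONLY)
os.read(fd, 4)  # parent consumes b"ABCD"; the file offset is now 4

pid = os.fork()
if pid == 0:
    # Child: the offset lives in the shared open file description, so a
    # plain read() continues at offset 4 and sees b"EFG", not b"ABC".
    os._exit(0 if os.read(fd, 3) == b"EFG" else 1)

status = os.waitpid(pid, 0)[1]

# The child's read moved the offset for the parent too -- exactly the
# shared state that makes concurrent lseek+read unreliable.
shared_offset = os.lseek(fd, 0, os.SEEK_CUR)   # 7, not 4

# pread takes an explicit offset and does not move the shared offset.
first_three = os.pread(fd, 3, 0)               # b"ABC"
offset_after_pread = os.lseek(fd, 0, os.SEEK_CUR)
os.close(fd)
```

With many children issuing lseek+read concurrently the interleaving becomes nondeterministic; pread sidesteps the problem because each call carries its own offset.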

Maybe that should be submitted as a low-urgency/experimental improvement suggestion?

Best wishes in 2019,
Andrey Paramonov

While use of pread and pwrite may make some sense, I am just curious if the solution to “…opens HDF5 file read-only, forks, and starts access…” is to just change the order of operations to “…forks, opens HDF5 file read-only and starts accessing…”? I mean, why is it so important to have the HDF5 file handle maintain consistency across forks when the underlying standard interfaces (fopen/fread/fwrite/fclose or open/read/write/close) don’t do that either? Do we know if there are any performance implications of pread/pwrite vs. read/write? Do we know if most implementations of pread/pwrite just turn around and use read/write? It seems like reducing from 2 system calls (e.g. seek followed by read) to one (pread) would be a performance benefit. But, I honestly don’t know.

Hello,

And Happy New Year!

while use of pread and pwrite may make some sense, I am just curious if
solution to “…opens HDF5 file read-only, forks, and starts access…” is
to just change order of operations to “…forks, opens HDF5 file read-only
and starts accessing…”?

Personally I can only speculate, but it seems reasonable when the operation in
a forked process happens conditionally, depending on other contents of the
HDF5 file. In that case, re-opening the file may incur a performance penalty
and sub-optimal caching. In theory, forking with the file already open should
be faster on modern OSes.
Hopefully, the original h5py issue submitter chimes in!
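For completeness, the reordering suggested above can be sketched as follows (again plain POSIX calls through Python’s os module, with a placeholder file path): when each process opens the file only after the fork, it gets its own open file description with its own offset, so plain lseek+read cannot race with the other process.

```python
import os

# Placeholder data file for the demonstration.
path = "/tmp/fork_then_open_demo.bin"
with open(path, "wb") as f:
    f.write(b"ABCDEFGHIJ")

pid = os.fork()
if pid == 0:
    # Child opens the file itself: it gets an independent open file
    # description and offset, so the parent's seeks cannot disturb it.
    fd = os.open(path, os.O_RDONLY)
    data = os.read(fd, 3)          # b"ABC" regardless of the parent
    os.close(fd)
    os._exit(0 if data == b"ABC" else 1)

# Parent likewise opens after the fork and moves only its own offset.
fd = os.open(path, os.O_RDONLY)
os.lseek(fd, 5, os.SEEK_SET)
parent_data = os.read(fd, 3)       # b"FGH"
status = os.waitpid(pid, 0)[1]
os.close(fd)
```

The trade-off, as noted above, is that each process must re-open the file and rebuild any caches, which is what pread in the library itself would avoid.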

I mean, why is it so important to have HDF5 file
handle maintain consistency across forks when underlying standard
interfaces (fopen/fread/fwrite/fclose or open/read/write/close) don’t
do that either?

You are right that lseek+read is a standard interface, but pread is
a standard interface as well :wink: HDF5 could use the latter to inherit
its good properties.

Do we know if there are any performance implications of
pread/pwrite vs. read/write? Do we know if most implementations of
pread/pwrite just turn around and use read/write? It seems like reducing
from 2 system calls (e.g. seek followed by read) to one (pread) would be
a performance benefit. But, I honestly don’t know.

I believe it wouldn’t be any slower, only a bit less portable.

Best wishes,
Andrey Paramonov

pread is meant to give multi-threaded programs an atomic seek/read. The same goes for pwrite.

Ger

FYI… a need for this came up in another (HPC) context, and because the NNSA labs are currently working with THG on HPC-specific enhancements, we’ve asked THG to invest some effort in addressing this.

If you’re interested in a pre-release, have a look at the pread_vfd branch in their Bitbucket repo or the live clone on GitHub.

Apart from the atomicity of file access in H5Dread/H5Dwrite, a very preliminary peek at performance in some specific tests shows slight improvements too.

Cheers

It appears that if you’re using HDF5 1.10.5 or later, compiled on a system where pread and pwrite are available, this should be resolved. The HDF5 1.10.5 release notes say:

Added a new option to enable/disable using pread/pwrite instead of read/write in the sec2, log, and core VFDs.

This option is enabled by default when pread/pwrite are detected.

Autotools: --enable-preadwrite
CMake: HDF5_ENABLE_PREADWRITE
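Assuming a standard HDF5 1.10.5+ source tree, enabling the option explicitly might look like this (a sketch; build and source paths are placeholders, and the option is already on by default when pread/pwrite are detected):

```shell
# Autotools build:
./configure --enable-preadwrite
make && make install

# CMake build, from a separate build directory:
cmake -DHDF5_ENABLE_PREADWRITE=ON /path/to/hdf5-source
cmake --build .
```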