Hello
I am trying to read an HDF5 file while it is being written by another process, without using SWMR mode. I understand that this may not be officially supported, but I would like to understand whether there is any safe or semi-safe workaround for my use case, or whether this is fundamentally impossible.
I will describe the setup, what works initially, and the errors I eventually hit.
System / environment

- Platform: Windows 10 x64
- Compiler: MSVC (Visual Studio 2022)
- HDF5 version: 1.14.5 (from Conan)
- Language: C++ (HDF5 C++ API + some direct C API calls)
- Storage: local NTFS filesystem
- Processes: separate writer process and reader process (no MPI)
Use case

- One process writes time steps incrementally into an HDF5 file.
- New groups, subgroups, and datasets are created as the simulation progresses.
- Another process repeatedly opens the file, reads the available data, then closes it.
- The reader does not keep the file handle open permanently.
- No SWMR mode (this is intentional for now).
Expected behavior (best case):

- The reader may fail to see the newest data, but should not crash.
- On retry, the reader should eventually see newly added groups/datasets.
What I tried

Reader side

- File opened read-only (`H5F_ACC_RDONLY`)
- Tried:
  - disabling file locking
  - opening a fresh file handle on every retry
  - exponential backoff between retries
- No cached HDF5 objects kept between iterations
Writer side

- File opened with `H5F_ACC_RDWR`, or created with `H5F_ACC_TRUNC`
- Groups and datasets are created dynamically
- `H5Fflush(..., H5F_SCOPE_GLOBAL)` is called after writes
Initially, this sometimes works:

- The reader can read the file correctly for a few iterations
- New time steps appear as expected
Errors encountered

After some time, the reader fails with hard HDF5 errors, not just "object not found".

Example 1 — dataset read failure

```
H5Dread(): can't synchronously read data
...
H5FD_read(): addr overflow, addr = 3299221668, size = 8, eoa = 2922366596
  major: Invalid arguments to routine
  minor: Address overflowed
```

Example 2 — object header / metadata corruption (most worrying)

```
H5O_get_info(): unable to load object header
H5C__verify_len_eoa(): address of object past end of allocation
  major: Object cache
  minor: Bad value
```
At this point the reader fails consistently, even after the writer has finished: the same reader process keeps retrying in its loop and hits these errors on every attempt.
My understanding so far

From reading documentation and debugging:

- These errors appear when the reader sees partially written metadata
- The file's EOA (end of allocation) changes while the reader is accessing metadata
- HDF5 correctly rejects inconsistent object headers
My question

I fully understand that SWMR is the supported solution, but before fully restructuring my pipeline, I would like to ask:

- Is there any documented or undocumented workaround to make this pattern safer without SWMR? E.g.:
  - specific flush patterns
  - reopening file handles
  - metadata cache controls
- Is it strictly required to freeze the file topology (pre-create all groups/datasets) if SWMR is not used?
- Is the observed behavior expected, or am I misusing the API in a way that could be corrected?
I am not looking for guarantees of seeing the latest data — only to avoid hard failures and metadata corruption errors on the reader side.
Minimal example
I can provide a minimal reproducer (two small programs: writer + reader) if helpful.
Thank you for your time and for maintaining HDF5.
Any clarification on what is fundamentally unsupported vs. potentially workable would be greatly appreciated.
Best regards,
Younes
